DSO 


Structural  Change  and  Interaction  Behavior  in 
Multimodal  Networks 

Technical  Report 


Dr.  Loo-Nin  Teow,  Xinghao  Pan,  Wen-Haw  Chong,  Belinda  Wei-Shan  Toh 

DSO  National  Laboratories 

Prof.  Ee-Peng  Lim,  Asst.  Prof.  Jing  Jiang,  Dr.  Byung-Won  On 
Singapore  Management  University 


July  30,  2010 




Report  Documentation  Page 


Form  Approved 
OMB  No.  0704-0188 




1.  REPORT  DATE 

30  JUL  2010 


2.  REPORT  TYPE 

Final 


4.  TITLE  AND  SUBTITLE 

Structural  Change  and  Interaction  Behavior  in  Multimodal  Networks 


6.  AUTHOR(S) 

Loo-Nin  Teow;  Xinghao  Pan;  Wen-Haw  Chong;  Belinda  Wei-Shan  Toh; 
Ee-Peng  Lim 

7.  PERFORMING  ORGANIZATION  NAME(S)  AND  ADDRESS(ES) 

DSO  National  Laboratories, (Singapore  Management  University), 20 
Science  Park  Drive, Singapore, SN, 118230 

9.  SPONSORING/MONITORING  AGENCY  NAME(S)  AND  ADDRESS(ES) 

Asian  Office  of  Aerospace  Research  &  Development,  (AOARD),  Unit 
45002,  APO,  AP,  96338-5002 


3.  DATES  COVERED 

17-07-2009  to  29-07-2010 

5a.  CONTRACT  NUMBER 

FA23860914124 

5b.  GRANT  NUMBER 

5c.  PROGRAM  ELEMENT  NUMBER 

5d.  PROJECT  NUMBER 

5e.  TASK  NUMBER 

5f.  WORK  UNIT  NUMBER 

8.  PERFORMING  ORGANIZATION 
REPORT  NUMBER 

N/A 

10.  SPONSOR/MONITOR'S  ACRONYM(S) 

AOARD 

11.  SPONSOR/MONITOR'S  REPORT 
NUMBER(S) 

AOARD-094124 


12.  DISTRIBUTION/AVAILABILITY  STATEMENT 

Approved  for  public  release;  distribution  unlimited 

13.  SUPPLEMENTARY  NOTES 

14.  ABSTRACT 

This  work  presents  the  results  of  research  focused  on  mining  information  from  multi-network  interactions 
for  the  purpose  of  link  prediction.  Multi-networks  are  a  generalization  of  multimodal  networks. 
Multi-network  link  prediction  was  evaluated  on  the  HEP-th  (theoretical  high-energy  physics)  authorship 
multinetwork.  Achievements  include  1)  a  novel  iterative  procedure  for  estimating  unified  multinetwork 
node  similarity  based  only  on  the  network  structure  information;  2)  label  propagation  algorithm  to 
perform  adjacency  propagation  through  the  similarity  matrices  to  produce  a  ranking  of  potential  new 
links.  The  work  also  researched  modelling  engagingness  and  responsiveness  behaviors  in  email  networks 
and  messaging  networks.  Several  quantitative  models  for  measuring  user  engagingness  and  responsiveness 
behaviors  were  defined. 


15.  SUBJECT  TERMS 


16. SECURITY CLASSIFICATION OF:

a. REPORT: unclassified

b. ABSTRACT: unclassified

c. THIS PAGE: unclassified

17. LIMITATION OF ABSTRACT

Same as Report (SAR)

18. NUMBER OF PAGES

64

19a. NAME OF RESPONSIBLE PERSON

Standard  Form  298  (Rev.  8-98) 

Prescribed  by  ANSI  Std  Z39-18 


Contents 


Introduction

Part A: Structural Change

Multimodal Node Similarity for Link Prediction

Part B: Interaction Behavior

Mining Interaction Behaviors from Information Exchange Networks




Introduction 


Background 

This  is  a  DSO  National  Laboratories  project,  funded  by  DARPA  through 
AOARD  (award  number  FA2386-09-1-4124),  with  Singapore  Management 
University  as  a  sub-contractor.  It  is  a  one-year  project,  officially  starting  on  14 
July  2009. 

Objective 

The  goals  of  this  project  are  twofold: 

1.  Structural  change:  To  perform  analysis  of  structural  change  in  multimodal 
networks  across  a  variety  of  domains  in  a  unified  framework,  with  the 
eventual  goal  of  developing  a  multimodal  link  prediction  algorithm. 

2. Interaction behavior: To investigate the characterization and measurement of
behavior in multimodal networks (e.g. engagingness, responsiveness, etc.).

Achievements 

Structural  change: 

Our research focused on mining information from multi-network interactions
for the purpose of link prediction. (We view multi-networks as a generalization
of multimodal networks.) We evaluated our multi-network link prediction
algorithm on the HEP-th (theoretical high-energy physics) authorship
multi-network. Our achievements for this part can be summarized as follows:

1. We proposed a novel iterative procedure for estimating unified
multi-network node similarity based only on the network structure information.

2. We extended an existing label propagation algorithm to perform adjacency
propagation through the similarity matrices to produce a ranking of
potential new links.

3. We evaluated our link prediction algorithm with the real-world HEP-th
dataset, and demonstrated the ability of our algorithms to exploit
multi-network information for the purpose of improving link prediction
performance.

Interaction  behaviors: 

For  this  part,  we  focused  on  modelling  engagingness  and  responsiveness 
behaviors  in  email  networks  and  messaging  networks.  The  Enron  email  data 
and  MyGamma  Social  Network  Message  data  were  used  as  the  target 
datasets.  Our  achievements  for  this  part  can  be  summarized  as  such: 

1. We defined several quantitative models for measuring user engagingness
and responsiveness behaviors prevalent in email networks. We then
adapted these models, and also developed new models, for messaging
networks.

2. We applied the respective models to the Enron email network and
MyGamma messaging network. Comparisons between engagingness and
responsiveness, and comparisons between different models, were made
using these real-world datasets.

3. We introduced email reply order prediction as a novel task, and showed
experimentally using the Enron data that the user behaviors are useful
features in the prediction task.

4. Finally, we showed that engaging and responsive users play important roles
in messaging topics within the MyGamma online community; specifically,
major topics in the community are driven by engaging and responsive
users.

Part A: Structural Change

Multimodal Node Similarity for Link Prediction

1 Introduction

In recent years, the study of networks has received considerable
attention from researchers in diverse fields such as sociology,
physics, biology and computer science. It is increasingly recognized that
many real-world domains are highly relational in nature, as entities do not
exist in isolation but constantly interact with one another.

A problem of particular importance is link prediction, specifically
conjecturing the formation of new edges between entities over time. Primary
applications include friend recommendation in social networks and predicting
collaboration between authors [14]. Link prediction has also been applied
to market targeting [24] and movie ratings prediction [19], among other uses.

One approach to predicting links in a homophilic network [12] (e.g.
friendship or co-authorship networks) involves first computing a similarity
measure between every pair of entities, and then simply ranking each potential
link by the pair-wise similarity value [14]. However, such methods are
constrained to single-relation homophilic networks.

On the other hand, real-world networks are highly complex, often comprising
multiple types of entities and relationships. For example, cities are
linked by transportation routes in geographical networks, IP addresses are
linked by LAN connections in cyber networks, and bank accounts are linked
by transfers in financial networks [25]. In a complex social network, people
are linked by friendships, family ties, superior-subordinate and other
relationships. Furthermore, links between different types of entities (e.g. people
living in cities and owning bank accounts) facilitate interactions between the
multiple networks. Intuitively, these interactions contain additional
information about the various entities, and can possibly be exploited for the
purpose of link prediction.

In order to improve link prediction, we propose to analyze these interactions
collectively in a multi-network. We then perform adjacency propagation
to produce a ranking of potential new links. The two key intuitions behind
our approach are as follows:

1.  Node  similarity:  Similar  nodes  have  common  neighbors,  and  are  linked 
to  nodes  that  are  themselves  similar. 

2.  Link  preference:  A  node  U  is  more  likely  to  form  links  with  another 
node  V,  if  V  is  similar  to  nodes  to  which  U  is  linked. 

These intuitions are formalized in the later sections. The first intuition is
applied to estimation of multi-network node similarities; the second intuition
is applied to multi-network link prediction. In using node similarities,
our approach can be seen as being in the same class as the framework of
Liben-Nowell & Kleinberg [14] for homophilic single-relation networks, but
further improving the link prediction, and also extending to the general
multi-network setting.

The  following  summarizes  the  important  research  contributions  of  our 
work  in  multi-network  link  prediction: 

•  A novel iterative procedure for estimating a unified multi-network node
similarity based only on the network structure information.

•  Extending the label propagation algorithm [30, 26] to perform adjacency
propagation through the similarity matrices to produce a ranking
of potential new links.

•  Experimental results using a real-world authorship multi-network,
demonstrating the ability of our algorithms to exploit multi-network
information for the purpose of improving link prediction performance.

The  remainder  of  our  report  is  organized  as  follows.  We  first  formulate 
the  problem  of  multi-network  node  similarity  estimation  for  link  prediction 
in  Section  2.  In  Section  3,  we  describe  our  approach  for  both  estimating 
node  similarity  and  link  prediction.  We  then  show  experimentally  that  (1) 
information  encoded  in  other  relations  is  useful  for  improving  link  prediction; 
and  (2)  our  proposed  method  is  able  to  exploit  such  information.  Related 
work is presented before we finally conclude in Section 6.

2 Problem Formulation


2.1  Preliminaries 

We  begin  with  some  definitions  and  notations. 

A simple network $G = (X, A)$ consists of a set of nodes or entities
$X = \{x_1, \ldots, x_n\}$ and an adjacency function $A : X \times X \to \{0, 1\}$, such that
$A(x_i, x_j) = 1$ whenever there is a link from $x_i$ to $x_j$. We will also treat
$A \in \{0, 1\}^{n \times n}$ as an adjacency matrix, such that $a_{i,j} = A(x_i, x_j)$.

A  mode  [27]  refers  to  a  distinct  set  of  entities.  A  multi-modal  network 
consists  of  possibly  more  than  one  distinct  set  of  entities  (e.g.  users  and 
movies  in  a  movie  rating  network). 

A  relation  [27]  refers  to  a  distinct  set  of  links.  A  multi-relational  network 
consists  of  possibly  more  than  one  distinct  set  of  links  (e.g.  ‘is-enemy-of’ 
and  ‘is-friend-of’  in  a  social  network). 

More generally, a multi-network is a multi-modal, multi-relational network.
We denote such a network by $G = (\{X_1, \ldots, X_m\}, \{A_{p \to q}\})$, with
$X_p = \{x_{p,1}, \ldots, x_{p,n_p}\}$ denoting the distinct modes, and
$A_{p \to q} : X_p \times X_q \to \{0, 1\}$ denoting an adjacency function (or adjacency
matrix $A_{p \to q} \in \{0, 1\}^{n_p \times n_q}$) from the mode $X_p$ to the mode $X_q$.
A simple network is thus a uni-modal, uni-relational network. A multi-network
can also be seen as a composition of multiple uni-relational networks
$(\{X_p, X_q\}, A_{p \to q})$.

A relation $A_{p \to q}$ is an undirected relation if $p = q$ and $A_{p \to q} = A_{p \to q}^T$.

A network $G = (\{X_1, \ldots, X_m\}, \{A_{p \to q}\})$ is an undirected network if every
$A_{p \to q}$ is an undirected relation.

A node $x_{q,j}$ such that $A_{p \to q}(x_{p,i}, x_{q,j}) = 1$ is termed an out-neighbor of
$x_{p,i}$. The set of nodes $\{x_{q,j} : A_{p \to q}(x_{p,i}, x_{q,j}) = 1\}$ is the out-neighborhood
of $x_{p,i}$. Conversely, $x_{p,i}$ is an in-neighbor of $x_{q,j}$ if $A_{p \to q}(x_{p,i}, x_{q,j}) = 1$, and
the set $\{x_{p,i} : A_{p \to q}(x_{p,i}, x_{q,j}) = 1\}$ is the in-neighborhood of $x_{q,j}$. Both
neighbors and neighborhoods are defined with respect to the relation $A_{p \to q}$.

We further introduce a temporal aspect to the multi-network so that
$G_t = (\{X_{t,1}, \ldots, X_{t,m}\}, \{A_{t,p \to q}\})$ denotes a multi-network at time $t$. The
subscript $t$ will be omitted when the time frame is clear from the context.
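As a rough illustration of these definitions, a multi-network can be held as a set of named modes plus a set of named relations, each with its own 0/1 adjacency matrix. The class name `MultiNetwork`, its fields and its method names are assumptions made for this sketch and do not come from the original work. Relations are keyed by name rather than by the mode pair, because several relations may connect the same pair of modes.

```python
from dataclasses import dataclass, field


@dataclass
class MultiNetwork:
    """Hypothetical container: modes[p] lists the nodes of mode p, and
    relations[name] = (p, q, A) where A is an n_p x n_q 0/1 matrix."""
    modes: dict
    relations: dict = field(default_factory=dict)

    def add_relation(self, name, p, q, links):
        # Build the adjacency matrix A_{p->q} from (source, target) pairs.
        index_p = {x: i for i, x in enumerate(self.modes[p])}
        index_q = {x: j for j, x in enumerate(self.modes[q])}
        A = [[0] * len(index_q) for _ in index_p]
        for src, dst in links:
            A[index_p[src]][index_q[dst]] = 1
        self.relations[name] = (p, q, A)

    def out_neighborhood(self, name, node):
        # Nodes x_{q,j} with A_{p->q}(node, x_{q,j}) = 1.
        p, q, A = self.relations[name]
        i = self.modes[p].index(node)
        return {x for j, x in enumerate(self.modes[q]) if A[i][j] == 1}

    def in_neighborhood(self, name, node):
        # Nodes x_{p,i} with A_{p->q}(x_{p,i}, node) = 1.
        p, q, A = self.relations[name]
        j = self.modes[q].index(node)
        return {x for i, x in enumerate(self.modes[p]) if A[i][j] == 1}

    def is_undirected(self, name):
        # Undirected means p = q and A equals its transpose.
        p, q, A = self.relations[name]
        n = len(A)
        return p == q and all(A[i][j] == A[j][i]
                              for i in range(n) for j in range(n))


net = MultiNetwork(modes={"Author": ["alice", "bob"],
                          "Paper": ["p1", "p2", "p3"]})
net.add_relation("Authorship", "Author", "Paper",
                 [("alice", "p1"), ("bob", "p2"), ("alice", "p3")])
net.add_relation("CoAuthorship", "Author", "Author",
                 [("alice", "bob"), ("bob", "alice")])
```

Dense lists of rows keep the sketch short; a real dataset of this kind would more plausibly use a sparse representation.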

2.2  Problem  definition 

In  the  real  world,  multi-networks  evolve  structurally  over  time  through  gain 
and  loss  of  both  nodes  and  links.  In  this  report,  we  are  interested  in  the 
addition  of  new  edges  between  existing  vertices.  In  particular,  we  consider 
the  problem  of  ranking  potential  new  links  for  each  existing  node.  We  term 
this  problem  multi-network  temporal  link  prediction.  Formally, 


Given $G_t = (\{X_{t,1}, \ldots, X_{t,m}\}, \{A_{t,p \to q}\})$ at time $t$, for each
node $x_{p,i} \in X_{t,p}$ and relation $A_{t,p \to q}$, can we accurately rank
nodes $\{x_{q,j} \in X_{t,q} : A_{t,p \to q}(x_{p,i}, x_{q,j}) = 0\}$ according to the
likelihood that $A_{t+1,p \to q}(x_{p,i}, x_{q,j}) = 1$?

We observe that real world entities participate in various interactions
with different types of entities. Intuitively, these interactions should provide
us with more information about the relation on which we are performing link
prediction. For instance, consider two authors Alice and Bob who have
each collaborated with the same authors. Furthermore, both Alice's and
Bob's publications have often cited the same papers. With this information,
we are inclined to think that Alice and Bob are "similar" in some fashion,
and possibly have overlapping research interests. It does not take a great
leap of imagination to then think that Alice's next publication will likely be
at a conference where Bob has previously published, and vice versa.

Motivated by such intuitions, we explore the research question of whether
information about different interactions can be exploited to estimate a unified
similarity measure between pairs of entities of the same type, for the
purpose of accurate link prediction. We term this problem multi-network
node similarity estimation. Formally,

Given $G_t = (\{X_{t,1}, \ldots, X_{t,m}\}, \{A_{t,p \to q}\})$ at time $t$, can we
utilize information stored in the relations $\{A_{t,p \to q}\}$ to estimate a
node similarity matrix $S_p$ for each mode $X_{t,p}$?

Hence, our concern is with the two-part problem of first estimating
multi-network node similarities, and then applying the multi-network node
similarities to the problem of temporal link prediction.

We  intend  for  our  approach  to  be  agnostic  to  the  semantics  of  entities 
and  links.  The  real  world  identities  of  entities  and  links  are  withheld,  and 
only  information  encoded  in  the  multi-network  structure  is  exploited  in  our 
approach.  We  also  do  not  make  any  homophily  assumptions.  By  doing  so, 
we  hope  to  generalize  our  approach  to  a  large  class  of  multi-networks. 

We also do not utilize any non-structural features of entities in the
multi-network. While we do not rule out the possibility of using non-structural
features for estimating node similarities (see Section 6), we have thus far
focused on the usefulness of structural features in multi-networks.

3 Approach


We break down our approach into two separate parts according to the two
problems defined above. First, we describe how we construct a unified
similarity matrix for each mode in a multi-network. Next, we explain how
the unified similarity matrices can be used for link prediction.

3.1  Multi-Network  Node  Similarity 

The problem of measuring node similarity in simple networks is not new.
Liben-Nowell & Kleinberg [14] evaluated a number of node similarity measures
for their effectiveness in link prediction. We discuss two such ideas before
presenting our own balanced approach extended to the multi-network
setting.

3.1.1  Common  Neighbors 

A direct way of measuring similarity of two nodes in a simple network is to
simply count the number of neighbors that are common to both. Formally,
the common-neighbors similarity is $S = A A^T = A^T A$ in an undirected
network where $A = A^T$. (In a directed network, $A A^T$ would define a
similarity based on common out-neighbors. The similarity based on common
in-neighbors can be analogously defined as $A^T A$.) Although simple, the
common-neighbors similarity measure performed surprisingly well in the
evaluations of [14].

It is easy to extend this model to a weighted form by introducing a
weight for each node, so that the (undirected) common neighbors similarity
is $S = A W A^T = A^T W A$, where $W$ is a diagonal matrix with diagonal
elements $w_{i,i}$ equal to the weights of the corresponding nodes $x_i$. For
instance, if we set $w_{i,i} = (\sum_j a_{j,i})^{-1}$, then $A W A^T$ would define a similarity
based on common out-neighbors, each inversely weighted by its number of
in-neighbors. Conversely, if we set $w_{i,i} = (\sum_j a_{i,j})^{-1}$, then $A^T W A$ would
define a similarity based on common in-neighbors, each inversely weighted
by its number of out-neighbors.
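The weighted common out-neighbors similarity $A W A^T$ described above can be sketched in a few lines. The function name and the toy adjacency matrix are illustrative assumptions, as is the choice to give weight 0 to nodes without in-neighbors (such nodes can never be a common out-neighbor, so the choice does not affect the result).

```python
def weighted_common_out_neighbors(A):
    """Return S = A W A^T, with w_kk = 1 / (number of in-neighbors of node k)."""
    n = len(A)
    in_degree = [sum(A[i][k] for i in range(n)) for k in range(n)]
    # Assumption of this sketch: nodes with no in-neighbors get weight 0.
    w = [1.0 / d if d > 0 else 0.0 for d in in_degree]
    return [[sum(A[i][k] * w[k] * A[j][k] for k in range(n))
             for j in range(n)]
            for i in range(n)]


# Toy directed network: 0 -> 2, 0 -> 3, 1 -> 2, 2 -> 3.
A = [[0, 0, 1, 1],
     [0, 0, 1, 0],
     [0, 0, 0, 1],
     [0, 0, 0, 0]]
S = weighted_common_out_neighbors(A)
```

Nodes 0 and 1 share the out-neighbor 2, which has two in-neighbors, so their similarity is 1/2; nodes 1 and 2 share no out-neighbor and get similarity 0.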

A shortcoming of the common neighbors method for measuring node similarity
is its inability to capture relationships that may exist over multiple
hops. The common neighbors method is thus unable to exploit information
encoded in relations several hops away in the multi-network. (In the earlier
authorship network example, Alice and Bob are similar because their
publications cited the same papers; this similarity is not captured by the common
neighbors method.) The next method is formulated to address exactly this
problem of multi-hop similarities.

3.1.2  Recursive  Neighborhood  Similarity 

The intuition behind recursive neighborhood similarity is that similar nodes
are related to similar nodes. More precisely, nodes $x_i$ and $x_j$ are similar if
they are linked to nodes $x'_i$ and $x'_j$ respectively, and $x'_i$ and $x'_j$ are themselves
similar. The idea of recursive similarity is not new, having appeared
previously in SimRank [7]. We present a matrix formulation for recursive
neighborhood similarity that differs from SimRank mainly in the form of
normalization used.

Let $S$ and $\tilde{S}$ denote the node similarity matrix and neighborhood similarity
matrix respectively. In practice, for a simple network, $\tilde{S}$ is an unnormalized
or unsmoothed version of $S$. The differentiation between the two
will become clearer in the next section. Further let $\bar{A}$ denote a suitably
weighted adjacency matrix. We can then formalize the recursive neighborhood
similarity with the equations:

$$\tilde{S} = \bar{A} S \bar{A}^T \qquad (1)$$

$$S = D(\tilde{S})^{-1/2} \, \tilde{S} \, D(\tilde{S})^{-1/2} \qquad (2)$$

where $D(\tilde{S})$ returns a diagonal matrix with diagonal elements $d_{i,i} = \sum_j \tilde{s}_{i,j}$.
This form of normalization has been advocated in [18, 13] for spectral
clustering, and is also how the graph Laplacian is normalized in spectral graph
theory [1].

We follow SimRank in proposing an iterative solution to estimating
recursive neighborhood similarities:

$$\tilde{S}^{(k)} \leftarrow \bar{A} S^{(k-1)} \bar{A}^T \qquad (3)$$

$$S^{(k)} \leftarrow D(\tilde{S}^{(k)})^{-1/2} \, \tilde{S}^{(k)} \, D(\tilde{S}^{(k)})^{-1/2} \qquad (4)$$

where $S^{(k)}$ and $\tilde{S}^{(k)}$ are the node similarity matrix and neighborhood
similarity matrix computed at the $k$th iteration.

Note that the above formulation is based on similarity of out-neighborhoods.
To compute the recursive in-neighborhood similarity, we would simply swap
$\bar{A}$ and $\bar{A}^T$ in the above equations.

Although this formulation of similarity is able to capture multi-hop
relationships, it was demonstrated in [14] that a link predictor based on
SimRank does not perform as well as one based on common neighbors.

3.1.3  Balanced Model


In their current forms, both the common neighbors and recursive neighborhood
similarity methods deal with simple networks, and neither is applicable
to the multi-network setting. We propose to combine the two in a
weighted fashion and further extend to multi-networks. The key assumption
of our approach is that similar nodes have common neighbors and are also
linked to similar nodes.

Let $\rho \in [0, 1]$ be a parameter controlling the balance between the common
neighbors and recursive neighborhood models. We then define the iterative
procedure for the balanced model as:

$$\tilde{S}^{(k)} \leftarrow \bar{A} \left[ \rho S^{(k-1)} + (1 - \rho) I_n \right] \bar{A}^T = \rho \bar{A} S^{(k-1)} \bar{A}^T + (1 - \rho) \bar{A} \bar{A}^T \qquad (5)$$

$$S^{(k)} \leftarrow D(\tilde{S}^{(k)})^{-1/2} \, \tilde{S}^{(k)} \, D(\tilde{S}^{(k)})^{-1/2} \qquad (6)$$

where $\bar{A} = A D(A^T)^{-1/2}$ is an adjacency matrix with each node inversely
weighted by the square root of its number of in-neighbors. In practice, we
find that this form of weighting nodes works best.

By setting the balance parameter $\rho = 0$ and with an initial value of $S^{(0)} = I_n$, we immediately
get convergence with $\tilde{S}^{(1)} = \bar{A} \bar{A}^T = A D(A^T)^{-1} A^T$. This is exactly the
weighted version of the common neighbors similarity, with the diagonal
weight matrix $W = D(A^T)^{-1}$. On the other hand, by setting $\rho = 1$, the
iterative update equations reduce to those used in computing the recursive
neighborhood similarity.
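The single-network balanced iteration can be sketched as below, assuming a dense list-of-rows adjacency matrix, the in-degree weighting described above, and an identity matrix as the starting similarity. All function names, the fixed iteration count and the guard for nodes with zero degree are assumptions of this sketch rather than details given in the report. Setting the balance parameter to 0 reproduces weighted common neighbors, and setting it to 1 gives the recursive neighborhood similarity.

```python
import math


def transpose(X):
    return [list(col) for col in zip(*X)]


def matmul(X, Y):
    cols = transpose(Y)
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in X]


def inv_sqrt(values):
    # Assumption of this sketch: zero sums (isolated nodes) map to 0.
    return [1.0 / math.sqrt(v) if v > 0 else 0.0 for v in values]


def weight_adjacency(A):
    """Scale column k of A by 1 / sqrt(in-degree of node k)."""
    scale = inv_sqrt([sum(col) for col in zip(*A)])
    return [[a * s for a, s in zip(row, scale)] for row in A]


def sym_normalize(M):
    """Divide entry (i, j) of M by sqrt(row sum i) * sqrt(row sum j)."""
    scale = inv_sqrt([sum(row) for row in M])
    return [[scale[i] * m * scale[j] for j, m in enumerate(row)]
            for i, row in enumerate(M)]


def balanced_similarity(A, rho, iterations=10):
    """Return (neighborhood similarity, node similarity) after the iterations."""
    n = len(A)
    A_bar = weight_adjacency(A)
    A_bar_T = transpose(A_bar)
    S = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    S_tilde = S
    for _ in range(iterations):
        # rho * S + (1 - rho) * I
        mix = [[rho * S[i][j] + (1.0 - rho) * (i == j) for j in range(n)]
               for i in range(n)]
        S_tilde = matmul(matmul(A_bar, mix), A_bar_T)
        S = sym_normalize(S_tilde)
    return S_tilde, S


# Toy directed network: 0 -> 2, 0 -> 3, 1 -> 2, 2 -> 3.
A = [[0, 0, 1, 1],
     [0, 0, 1, 0],
     [0, 0, 0, 1],
     [0, 0, 0, 0]]
S_tilde_cn, _ = balanced_similarity(A, rho=0.0, iterations=1)
S_tilde_bal, S_bal = balanced_similarity(A, rho=0.5, iterations=5)
```

With the balance parameter at 0 the neighborhood similarity no longer depends on the previous iterate, which is why one iteration already gives the converged value.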

We now extend the balanced model to include multiple relations. First,
for simplicity, for every relation $A_{p \to q}$, we include the reverse relation
$A_{q \to p} = A_{p \to q}^T$ in the multi-network. This allows us to properly account for similarity
based on both in- and out-neighbors, without having to explicitly consider
both directions.

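Let $\tilde{S}_{p \to q}$ be the neighborhood similarity matrix with respect to the
relation $A_{p \to q}$. Also, let $S_p$ be the overall node similarity matrix for mode
$X_p$. With $\rho$ denoting the balance parameter, we can then iteratively compute the similarity matrices

$$\tilde{S}_{p \to q}^{(k)} \leftarrow \bar{A}_{p \to q} \left[ \rho S_q^{(k-1)} + (1 - \rho) I_{n_q} \right] \bar{A}_{p \to q}^T \qquad (7)$$

$$S_p^{(k)} \leftarrow D\Big(\sum_q \tilde{S}_{p \to q}^{(k)}\Big)^{-1/2} \Big(\sum_q \tilde{S}_{p \to q}^{(k)}\Big) D\Big(\sum_q \tilde{S}_{p \to q}^{(k)}\Big)^{-1/2} \qquad (8)$$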

Essentially, the neighborhood similarity $\tilde{S}_{p \to q}$ is computed based on the
common out-neighbors and out-neighborhood similarities with respect to
relation $A_{p \to q}$, and the overall node similarity $S_p$ is the smoothed sum of
the neighborhood similarity matrices $\tilde{S}_{p \to q}$.

The balanced model can thus be thought of as combining the common
neighbors model with the recursive neighborhood similarity model, and
extending the combined model to the multi-network setting.
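A minimal sketch of the multi-relation procedure follows, assuming each relation is given as a `(source mode, target mode, adjacency matrix)` triple and that every mode starts from an identity similarity. The function names, the input layout, and the decision to skip the reverse of an undirected relation (to avoid counting it twice) are assumptions of this sketch, not details stated in the report.

```python
import math


def transpose(X):
    return [list(col) for col in zip(*X)]


def matmul(X, Y):
    cols = transpose(Y)
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in X]


def inv_sqrt(values):
    return [1.0 / math.sqrt(v) if v > 0 else 0.0 for v in values]


def weight_adjacency(A):
    scale = inv_sqrt([sum(col) for col in zip(*A)])
    return [[a * s for a, s in zip(row, scale)] for row in A]


def sym_normalize(M):
    scale = inv_sqrt([sum(row) for row in M])
    return [[scale[i] * m * scale[j] for j, m in enumerate(row)]
            for i, row in enumerate(M)]


def multi_network_similarity(sizes, relations, rho, iterations=5):
    """sizes: {mode: node count}; relations: list of (p, q, A), A is n_p x n_q."""
    directed = []
    for p, q, A in relations:
        directed.append((p, q, A))
        A_T = transpose(A)
        if not (p == q and A_T == A):  # an undirected relation is its own reverse
            directed.append((q, p, A_T))
    weighted = [(p, q, weight_adjacency(A)) for p, q, A in directed]
    S = {p: [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
         for p, n in sizes.items()}
    for _ in range(iterations):
        totals = {p: [[0.0] * n for _ in range(n)] for p, n in sizes.items()}
        for p, q, A_bar in weighted:
            n_q = sizes[q]
            mix = [[rho * S[q][i][j] + (1.0 - rho) * (i == j)
                    for j in range(n_q)] for i in range(n_q)]
            S_tilde = matmul(matmul(A_bar, mix), transpose(A_bar))
            for i, row in enumerate(S_tilde):
                for j, value in enumerate(row):
                    totals[p][i][j] += value  # sum over relations leaving mode p
        S = {p: sym_normalize(M) for p, M in totals.items()}
    return S


# Toy example: authors (alice, bob, carol) and papers (p1, p2, p3).
sizes = {"Author": 3, "Paper": 3}
authorship = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]  # each author wrote one paper
acitation = [[0, 0, 1], [0, 0, 1], [1, 0, 0]]   # alice and bob both cite p3
S = multi_network_similarity(
    sizes,
    [("Author", "Paper", authorship), ("Author", "Paper", acitation)],
    rho=0.5, iterations=3)
```

In this toy network Alice and Bob end up with a positive similarity because they cite the same paper, while Alice and Carol share nothing and stay at zero.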

Although matrix multiplication in general is a computationally expensive
operation, we note that real-world networks tend to have sparse
adjacency matrices. The sparsity of $A_{p \to q}$ can be exploited to reduce the
computational cost of the iterative procedure.

We point out that it is critical to differentiate between each relation by
separately computing each $\tilde{S}_{p \to q}^{(k)}$. A naive approach of combining adjacency
matrices prior to computing neighborhood similarities would possibly result
in illogical similarities. For example, Alice would gain a non-zero measure
of similarity with a journal she published in, by virtue of having a
common neighbor in Alice's publication in that journal.

3.2  Multi-Network  Link  Prediction 

We now discuss our adaptation of label propagation [30, 26] for
multi-network link prediction. In label propagation, class labels are propagated
from labeled to unlabeled data based on similarities between data points.
Given a node $x_{p,i}$, we treat its adjacent neighbors $\{x_{q,j} : A_{t,p \to q}(x_{p,i}, x_{q,j}) = 1\}$
as labeled, and non-adjacent nodes as unlabeled, i.e. we replace labels
with adjacencies. By then applying the label propagation algorithm,
we essentially perform adjacency propagation from adjacent neighbors to
non-adjacent nodes. We can then rank the non-adjacent nodes
$\{x_{q,j} : A_{t,p \to q}(x_{p,i}, x_{q,j}) = 0\}$ according to the adjacency information each $x_{q,j}$
received through the propagation.

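More precisely, let us define a function $F_{p \to q} : X_p \to (X_q \to \mathbb{R}^+)$ for each
relation $A_{p \to q}$. That is, $F_{p \to q}(x_{p,i})$ is itself a function, which returns a
non-negative real value for each $x_{q,j} \in X_q$. We can understand each $F_{p \to q}(x_{p,i})$
as a vector with each entry indicating the relative likelihood of each node
$x_{q,j}$ linking to the node $x_{p,i}$. We can also represent $F_{p \to q}$ as an $n_q \times n_p$ matrix
whose $(j,i)$-th element equals $F_{p \to q}(x_{p,i})(x_{q,j})$, so that column $i$ is the vector
for $x_{p,i}$ and the product $S_q F_{p \to q}$ used in the update below is well defined.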

We perform the adjacency propagation for relation $A_{p \to q}$ using the
iterative update equation:

$$F_{p \to q}^{(k)} \leftarrow \alpha S_q F_{p \to q}^{(k-1)} + (1 - \alpha) A_{p \to q}^T \qquad (9)$$

Intuitively, the neighbors of each $x_{p,i}$ are the "sources" of adjacencies
(second term), which are then propagated to other similar nodes (first term).

The parameter $\alpha \in (0, 1)$ controls the amount of adjacency information each
node $x_{q,j}$ receives from other similar nodes. The computational complexity
of the iteration is dominated by the matrix multiplication $S_q F_{p \to q}$, which
in general requires $O(n_q^3)$ multiplications (or $O(n_q^{2.807})$ multiplications with
the Strassen algorithm [20]). It may be possible to approximate this
computation by first sparsifying $S_q$ such that elements below a threshold are set
to 0. However, this approach was not tested for this report, as we were able
to complete our experiments within reasonable time.
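The propagation and ranking steps can be sketched as follows. The score matrix is stored with one row per target node and one column per source node so that it can be multiplied by the similarity matrix of the target mode. The function names, the choice of the transposed adjacency matrix as the starting point, the fixed iteration count, and the hand-picked similarity matrix are all assumptions of this sketch.

```python
def transpose(X):
    return [list(col) for col in zip(*X)]


def matmul(X, Y):
    cols = transpose(Y)
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in X]


def adjacency_propagation(S_q, A_pq, alpha, iterations=50):
    """Iterate F <- alpha * S_q F + (1 - alpha) * A_pq^T.

    Returns F as an n_q x n_p matrix; F[j][i] scores a link from the
    i-th node of mode p to the j-th node of mode q.
    """
    Y = [[float(a) for a in row] for row in transpose(A_pq)]
    F = [row[:] for row in Y]  # assumed starting point: existing adjacencies
    for _ in range(iterations):
        SF = matmul(S_q, F)
        F = [[alpha * sf + (1.0 - alpha) * y for sf, y in zip(sf_row, y_row)]
             for sf_row, y_row in zip(SF, Y)]
    return F


def rank_candidates(F, A_pq, i):
    """Rank the nodes of mode q not yet linked from node i, best first."""
    candidates = [j for j, a in enumerate(A_pq[i]) if a == 0]
    return sorted(candidates, key=lambda j: F[j][i], reverse=True)


# Hand-picked symmetric similarity over three q-nodes (eigenvalues 1, 0.2, 1).
S_q = [[0.6, 0.4, 0.0],
       [0.4, 0.6, 0.0],
       [0.0, 0.0, 1.0]]
# Two p-nodes: node 0 is linked to q-node 0, node 1 is linked to q-node 2.
A_pq = [[1, 0, 0],
        [0, 0, 1]]
F = adjacency_propagation(S_q, A_pq, alpha=0.5)
ranking = rank_candidates(F, A_pq, 0)
```

For the first p-node, the unlinked q-node 1 is similar to the already linked q-node 0 and therefore outranks q-node 2, which receives no adjacency information at all.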

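A sufficient condition [30] for convergence of this iteration is that the
eigenvalues of $S_q$ are in $[-1, 1]$ and that $0 < \alpha < 1$. Now, following the
analysis of [30, 26], we define the stochastic matrix

$$P = D\Big(\sum_p \tilde{S}_{q \to p}\Big)^{-1} \sum_p \tilde{S}_{q \to p} = D\Big(\sum_p \tilde{S}_{q \to p}\Big)^{-1/2} S_q \, D\Big(\sum_p \tilde{S}_{q \to p}\Big)^{1/2}.$$

Suppose $\lambda$ and $v$ are an eigenvalue and
eigenvector pair for $S_q$ such that $S_q v = \lambda v$. Then,

$$\lambda \, D\Big(\sum_p \tilde{S}_{q \to p}\Big)^{-1/2} v = D\Big(\sum_p \tilde{S}_{q \to p}\Big)^{-1/2} S_q v = P \, D\Big(\sum_p \tilde{S}_{q \to p}\Big)^{-1/2} v,$$

so $\lambda$ and $D(\sum_p \tilde{S}_{q \to p})^{-1/2} v$ are an eigenvalue-eigenvector pair for $P$. By the
Perron-Frobenius theorem, we know that $\lambda \in [-1, 1]$ as an eigenvalue of the
stochastic matrix $P$. Since this holds true for every eigenvalue of $S_q$, all
eigenvalues of $S_q$ are in $[-1, 1]$.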

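It can be shown [26] that the iteration minimizes the cost function
$\frac{1}{2} \left[ \operatorname{trace}\left( F_{p \to q}^T (I_{n_q} - S_q) F_{p \to q} \right) + \mu \, \| F_{p \to q} - A_{p \to q}^T \|^2 \right]$, where $\| \cdot \|$ denotes the
Frobenius norm, and $\alpha = \frac{1}{1 + \mu}$. The closed form solution is
$F_{p \to q}^* = (1 - \alpha)(I_{n_q} - \alpha S_q)^{-1} A_{p \to q}^T$ [30, 26].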
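For readers who want the intermediate step, the closed form can be recovered by unrolling the iterative update. The derivation below is a standard argument for this kind of propagation and is included as a sketch; the starting matrix $F^{(0)}$ is arbitrary, and the subscript $p \to q$ on $F$ is dropped for brevity.

```latex
\begin{align*}
F^{(k)} &= \alpha S_q F^{(k-1)} + (1-\alpha) A_{p \to q}^{T} \\
        &= (\alpha S_q)^{k} F^{(0)}
           + (1-\alpha) \sum_{i=0}^{k-1} (\alpha S_q)^{i} A_{p \to q}^{T}.
\end{align*}
% The eigenvalues of S_q lie in [-1, 1] and 0 < \alpha < 1, so the spectral
% radius of \alpha S_q is at most \alpha < 1. Hence
\begin{align*}
\lim_{k \to \infty} (\alpha S_q)^{k} = 0,
\qquad
\lim_{k \to \infty} \sum_{i=0}^{k-1} (\alpha S_q)^{i} = (I_{n_q} - \alpha S_q)^{-1},
\end{align*}
% which gives the limit
\begin{align*}
F^{*} = (1-\alpha)\,(I_{n_q} - \alpha S_q)^{-1} A_{p \to q}^{T}.
\end{align*}
```

The limit does not depend on $F^{(0)}$, so the choice of starting matrix affects only how many iterations are needed.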

4  Experiment 

We evaluated our method for estimating multi-network node similarity by
performing link prediction on a well-known authorship network. We used
an average AUC (area under the ROC curve) as our performance metric, and
demonstrated that the balanced model is able to significantly outperform a
baseline model based on single-relation common neighbors.

4.1  Dataset 

The base dataset that we used for evaluation is the Proximity HEP-th
database [8]. The Proximity HEP-th database is based on data from the
arXiv archive and the Stanford Linear Accelerator Center SPIRES-HEP
database provided for the 2003 KDD Cup competition, with additional
preparation performed by the Knowledge Discovery Laboratory, University of
Massachusetts Amherst.

Table 1: Extracted Relations from Proximity HEP-th dataset

Relation        Connects                     Remarks
CoAuthorship    Author ↔ Author              Undirected relation of co-authorship, equivalent to CoAuthored
APublication    Author → Journal             Derived relation of locations at which authors published
ACitation       Author → Paper               Derived relation of papers cited by an author
Authorship      Author → Paper               Equivalent to Authored
Affiliation     Author → EmailDomain         Equivalent to EmailAffil
CommonTopic     Journal ↔ Journal            Undirected relation, derived from topic attribute of papers
PPublication    Paper → Journal              Equivalent to PublishedIn
PCitation       Paper → Paper                Equivalent to Cites
SubDomain       EmailDomain → EmailDomain    Derived relation, e.g. xyz.abc.com → abc.com → com

Figure 1: Graphical representation of the modified HEP-th dataset with
extracted additional relations.

The dataset originally consists of four modes (EmailDomain, Journal,
Paper and Author), and five relations (PublishedIn, Authored, Cites,
CoAuthored, EmailAffil). We pre-processed the dataset to extract a total
of nine relations, while maintaining the original four modes. The extracted
relations are described in Table 1. Figure 1 shows a graphical representation
of this extended schema.

Our intention in extracting additional relations is to create different
kinds of relations. These include undirected relations (CoAuthorship and
CommonTopic), static relations that do not change over time (PCitation and
SubDomain), relations with tree structures (SubDomain), and multiple
relations that connect the same pair of modes (ACitation and Authorship) but
have different semantic meaning. We are therefore able to demonstrate that
our solution for link prediction can generalize to different types of relations.

Although  the  complete  dataset  spanned  the  years  1900  till  2003,  we 
observed  that  the  bulk  of  data  was  concentrated  in  the  years  1992  till  2002. 
Hence,  we  only  evaluated  the  data  within  this  timeframe,  and  segmented 
the  data  into  11  yearly  time  intervals. 

4.2  Baseline  method 

We consider a baseline model which measures node similarity based only on
weighted common in-neighbors. Thus, for prediction of relation $A_{p \to q}$, our
baseline method uses a node similarity matrix
$S_{q \to p} = A_{p \to q}^{\top} D(A_{p \to q})^{-1} A_{p \to q}$.

The normalized node similarity matrix
$\hat{S}_{q \to p} = D(S_{q \to p})^{-1/2} S_{q \to p} D(S_{q \to p})^{-1/2}$
is then used for adjacency propagation, as described in Section 3.2.

Note,  however,  that  the  baseline  model  differs  from  the  balanced  model 
with  p  =  0  applied  to  the  single-relation  network.  If  the  relation  is  directed, 
our  balanced  model  considers  both  common  in-  and  out-neighbors,  whereas 
the  baseline  model  accounts  for  one  but  not  the  other. 
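
As an illustration, the baseline similarity can be sketched in a few lines of Python. This is a minimal sketch rather than the implementation used in the experiments. It assumes that D(.) denotes the diagonal matrix of row sums and that the normalization is the symmetric one (exponent -1/2); the function names `baseline_similarity` and `normalize` are hypothetical.

```python
from math import sqrt

def baseline_similarity(A):
    """Weighted common in-neighbour similarity between the columns of A.

    A is an n_p x n_q 0/1 adjacency matrix (list of rows) for a relation p -> q.
    Returns S = A^T D^{-1} A, where D holds the row sums of A (assumption).
    Each shared neighbour contributes 1/degree, so high-degree neighbours count less.
    """
    n_q = len(A[0])
    S = [[0.0] * n_q for _ in range(n_q)]
    for row in A:
        deg = sum(row)
        if deg == 0:
            continue  # an isolated source node contributes nothing
        for j in range(n_q):
            if row[j]:
                for k in range(n_q):
                    if row[k]:
                        S[j][k] += 1.0 / deg
    return S

def normalize(S):
    """Symmetric normalization D^{-1/2} S D^{-1/2}, with D = diag(row sums of S)."""
    d = [sum(r) for r in S]
    n = len(S)
    return [[S[i][j] / sqrt(d[i] * d[j]) if d[i] > 0 and d[j] > 0 else 0.0
             for j in range(n)] for i in range(n)]

# Toy relation: two source nodes, three target nodes.
A = [[1, 1, 0],
     [0, 1, 1]]
S = baseline_similarity(A)
S_hat = normalize(S)
```

In this toy relation, targets 0 and 2 share no in-neighbour, so their similarity is zero, while target 1 shares one in-neighbour with each of the others.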

4.3  Performance  metric 

At each time $t$, the adjacency propagation algorithm generates a matrix
$F_{p \to q}$ for each relation $A_{t,p \to q}$ which provides a ranking of potential new
links for each node $x_{p,i}$. However, at time $t + 1$, we do not observe a ranking,
but a set of new-adjacencies
$U_{p \to q}(x_{p,i}) = \{x_{q,j} : A_{t,p \to q}(x_{p,i}, x_{q,j}) = 0 \wedge A_{t+1,p \to q}(x_{p,i}, x_{q,j}) = 1\}$
and a set of non-adjacencies
$V_{p \to q}(x_{p,i}) = \{x_{q,j} : A_{t,p \to q}(x_{p,i}, x_{q,j}) = 0 \wedge A_{t+1,p \to q}(x_{p,i}, x_{q,j}) = 0\}$.

Thus, we measure our ranking accuracy by the following performance
metric:

$$\mathrm{acc}(p \to q) = \frac{1}{|W|} \sum_{x_{p,i} \in W} \frac{1}{|U|\,|V|} \sum_{x_{q,j} \in U} \sum_{x_{q,k} \in V} \delta\big(F(i,j), F(i,k)\big) \qquad (10)$$

where $W = W_{p \to q} = \{x_{p,i} : \exists x_{q,j},\ A_{t,p \to q}(x_{p,i}, x_{q,j}) = 0 \wedge A_{t+1,p \to q}(x_{p,i}, x_{q,j}) = 1\}$
is the set of nodes in mode $X_p$ that have acquired new adjacencies at time
$t + 1$; $U = U_{p \to q}(x_{p,i})$ is the set of new-adjacencies that $x_{p,i}$ acquired at time
$t + 1$; $V = V_{p \to q}(x_{p,i})$ is the set of non-adjacencies of $x_{p,i}$ at times $t$ and
$t + 1$; $F(i,j) = F_{p \to q}(x_{p,i}, x_{q,j})$; and

$$\delta(f_i, f_j) = \begin{cases} 1 & \text{if } f_i > f_j \\ \tfrac{1}{2} & \text{if } f_i = f_j \\ 0 & \text{if } f_i < f_j \end{cases}$$

When all new-adjacencies are ranked above all non-adjacencies (that is,
$(\forall x_{p,i} \in W, x_{q,j} \in U, x_{q,k} \in V),\ F(i,j) > F(i,k)$), the metric attains a maximum
of 1, whereas when all new-adjacencies are ranked below all non-adjacencies
(that is, $(\forall x_{p,i} \in W, x_{q,j} \in U, x_{q,k} \in V),\ F(i,j) < F(i,k)$), the metric attains a
minimum of 0. If the ranking is perfectly random and totally uncorrelated
with the new-adjacencies, then the expected performance metric obtained
would be 0.5.
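
The metric can be sketched directly from its definition. The code below is an illustrative sketch only: the names `delta` and `ranking_accuracy` are hypothetical, matrices are plain lists of rows, and source nodes with no new adjacency (or with no non-adjacency to compare against) are skipped, which is an assumption about how empty sets are handled.

```python
def delta(fi, fj):
    """Pairwise comparison: 1 if fi > fj, 1/2 on ties, 0 otherwise."""
    if fi > fj:
        return 1.0
    if fi == fj:
        return 0.5
    return 0.0

def ranking_accuracy(F, A_t, A_t1):
    """Average, over source nodes that gained links, of the fraction of
    (new-adjacency, non-adjacency) pairs that the scores F order correctly."""
    per_node = []
    for i, scores in enumerate(F):
        cols = range(len(scores))
        U = [j for j in cols if A_t[i][j] == 0 and A_t1[i][j] == 1]  # new links
        V = [k for k in cols if A_t[i][k] == 0 and A_t1[i][k] == 0]  # still absent
        if not U or not V:
            continue  # node gained no links, or there is nothing to rank against
        total = sum(delta(scores[j], scores[k]) for j in U for k in V)
        per_node.append(total / (len(U) * len(V)))
    return sum(per_node) / len(per_node) if per_node else float("nan")

# Toy example: node 0 gains a link to target 1; node 1 gains nothing.
A_t  = [[1, 0, 0, 0], [0, 0, 0, 0]]
A_t1 = [[1, 1, 0, 0], [0, 0, 0, 0]]
F    = [[0.9, 0.8, 0.3, 0.8], [0.1, 0.2, 0.3, 0.4]]
acc = ranking_accuracy(F, A_t, A_t1)
```

Here the new link (score 0.8) beats one non-adjacency (0.3) and ties with the other (0.8), giving (1 + 0.5) / 2 = 0.75 for node 0, which is also the overall value because node 0 is the only node that gained a link.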

This performance metric can also be interpreted as an AUC (area under
the ROC curve; see footnote 1) measure. Consider a simple threshold binary classifier
$C^{\tau}_{p,i}(F_{p \to q}(i,j))$ which returns a predicted value of $A_{t+1,p \to q}(x_{p,i}, x_{q,j})$, such
that

$$C^{\tau}_{p,i}(F_{p \to q}(i,j)) = \begin{cases} 1 & \text{if } F_{p \to q}(i,j) > \tau \\ 0 & \text{otherwise} \end{cases}$$

That is, for a node $x_{p,i}$, the classifier $C^{\tau}_{p,i}$ predicts the formation of a
potential adjacency with node $x_{q,j}$ if and only if $F_{p \to q}(i,j) > \tau$. As we
increase $\tau$ from 0 to 1, the true positive rate and the false positive rate
both fall from 1 to 0, tracing out the ROC curve. The AUC for $C^{\tau}_{p,i}$ is then
computed as
$\frac{1}{|U|\,|V|} \sum_{x_{q,j} \in U} \sum_{x_{q,k} \in V} \delta(F(i,j), F(i,k))$.
Hence, $\mathrm{acc}(p \to q)$ is the AUC averaged over the set $W_{p \to q}$.
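
The equivalence between the pairwise count and the area under the ROC curve can be checked numerically. The sketch below is illustrative only (function names are hypothetical). It sweeps a threshold over the observed scores, integrates the resulting ROC points with the trapezoid rule, and compares the area with the pairwise average in which ties receive half credit.

```python
def auc_by_threshold_sweep(pos_scores, neg_scores):
    """Area under the ROC curve traced by lowering a score threshold."""
    thresholds = sorted(set(pos_scores) | set(neg_scores), reverse=True)
    points = [(0.0, 0.0)]  # threshold above every score: nothing predicted positive
    for tau in thresholds:
        # Everything scoring at least tau is predicted positive at this step.
        tpr = sum(s >= tau for s in pos_scores) / len(pos_scores)
        fpr = sum(s >= tau for s in neg_scores) / len(neg_scores)
        points.append((fpr, tpr))
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0  # trapezoid rule
    return area

def auc_by_pairs(pos_scores, neg_scores):
    """Fraction of (positive, negative) pairs ranked correctly, ties count 1/2."""
    total = 0.0
    for p in pos_scores:
        for n in neg_scores:
            total += 1.0 if p > n else 0.5 if p == n else 0.0
    return total / (len(pos_scores) * len(neg_scores))

new_adjacency_scores = [0.8, 0.4]
non_adjacency_scores = [0.8, 0.3, 0.1]
area = auc_by_threshold_sweep(new_adjacency_scores, non_adjacency_scores)
pairwise = auc_by_pairs(new_adjacency_scores, non_adjacency_scores)
```

Both computations give 0.75 on this toy input. The trapezoid rule is what gives tied scores half credit, matching the middle case of the comparison function.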


4.4  Evaluation 

Evaluation  is  done  for  every  consecutive  pair  of  yearly  time  intervals.  Since 
we  extracted  a  total  of  11  such  time  intervals,  we  were  able  to  evaluate  the 
approach  for  10  pairs  of  consecutive  time  intervals. 
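
The segmentation into yearly intervals and consecutive pairs can be sketched as follows. This is an assumption-laden sketch: the edge format `(source, target, year)` and the function names are hypothetical, and each interval holds only the edges dated in that year, whereas the experiments may have accumulated edges over time.

```python
from collections import defaultdict

def yearly_intervals(edges, first_year=1992, last_year=2002):
    """Group (source, target, year) edges into one bucket per year in range."""
    buckets = defaultdict(list)
    for src, dst, year in edges:
        if first_year <= year <= last_year:
            buckets[year].append((src, dst))
    return [buckets[y] for y in range(first_year, last_year + 1)]

def consecutive_pairs(intervals):
    """Pair each interval with its successor: (t, t+1)."""
    return list(zip(intervals, intervals[1:]))

edges = [("a", "b", 1991), ("a", "c", 1992), ("b", "c", 1993),
         ("a", "d", 2002), ("x", "y", 2003)]
intervals = yearly_intervals(edges)
pairs = consecutive_pairs(intervals)
```

With the default range this yields 11 intervals and 10 consecutive pairs, and the edges dated 1991 and 2003 are discarded.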

Footnote 1: An ROC curve is a graph of true positive rate (which is, in our
case, the fraction of new-adjacencies correctly classified) against false
positive rate (which is, in our case, the fraction of non-adjacencies wrongly
classified) produced by varying a parameter or threshold of a classifier.

We also point out that link prediction is only performed for nodes
which exist in the earlier time interval; link prediction for nodes that do
not yet exist is of lesser interest to us. For the relations Authorship,
PPublication, PCitation and SubDomain, no new adjacencies are formed
between existing nodes, i.e. $|U| = |W| = 0$. Thus, we do not perform link
prediction for these four relations. Instead, link prediction is only performed
for the relations CoAuthorship, APublication, ACitation, Affiliation
and CommonTopic. In addition, the relations APublication, ACitation and
Affiliation are directed relations, and so we also perform link prediction for
the reverse direction of these relations. In total, we perform link prediction
for eight relations.
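
The restriction to existing nodes can be made concrete with a small sketch. The function name `new_adjacencies` and the toy identifiers are hypothetical. The example shows why a relation such as Authorship yields no candidates (a new authorship edge always involves a newly created paper), while ACitation does.

```python
def new_adjacencies(edges_t, edges_t1, sources_t, targets_t):
    """Edges that appear at t+1, are absent at t, and join two nodes
    that both already exist at time t."""
    return {(u, v) for (u, v) in edges_t1 - edges_t
            if u in sources_t and v in targets_t}

authors_t = {"a1", "a2"}
papers_t = {"p1"}

# Authorship: the only new edge points at paper p2, which did not exist at t.
authorship_t = {("a1", "p1")}
authorship_t1 = {("a1", "p1"), ("a2", "p2")}
new_authorship = new_adjacencies(authorship_t, authorship_t1, authors_t, papers_t)

# ACitation: author a2 starts citing the existing paper p1.
acitation_t = {("a1", "p1")}
acitation_t1 = {("a1", "p1"), ("a2", "p1")}
new_acitation = new_adjacencies(acitation_t, acitation_t1, authors_t, papers_t)
```

In this toy case `new_authorship` is empty and `new_acitation` contains the single pair `("a2", "p1")`.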

For  our  experiments,  we  set  a  =  0.5  and  tested  for  p  =  0,  0.25,  0.5,  0.75, 1. 

We evaluated our balanced node similarity measure both for the
multi-network comprising all modes and relations, and for the
single-relation networks comprising only one relation each. That is, for the
multi-network, we compute a node similarity matrix $S_p$ for each mode, and then
use these similarity matrices for link prediction; for each single-relation
network comprising relation $A_{p \to q}$, we compute the node similarity matrices
$S_p$ and $S_q$, which are then used for predicting the relation $A_{p \to q}$ and the
reverse direction $A_{q \to p} = A_{p \to q}^{\top}$.

4.5  Results 

In  the  interest  of  space,  we  will  only  show  the  results  averaged  over  the  10 
pairs  of  yearly  time  intervals.  Overall  results  are  presented  in  Table  2. 

We analyze the results in this section, and highlight our key observations
in bold. We remind the reader that with p = 0, the balanced model
essentially reduces to a weighted common-neighbors model, while with p = 1, the
balanced model is purely a recursive neighborhood similarity model.

Balanced model with single-relation network performs at baseline.
Table 3 shows the improvements in average ranking accuracy that
are obtained over the baseline when the balanced model is used. For p =
0, 0.25, 0.5, 0.75, the difference in average ranking accuracy between the
baseline and balanced model is negligible. The low standard deviation shown
in Table 4 indicates that the negligible difference is consistently observed.
Thus, in the absence of additional information from other relations, our
balanced model's performance is neither significantly better nor worse than the
baseline weighted common neighbors method.

We do note that for p = 1, the pure recursive neighborhood similarity
model is susceptible to worse performance (see ACitation and the reverse
ACitation). This is consistent with the results obtained in Liben-Nowell
& Kleinberg [14], where a common-neighbors link predictor outperformed
SimRank.

Table 2: Average Accuracy

p         #relations  CoAuthorship  Affiliation  ACitation  APublication
Baseline  One         61.89%        53.11%       69.64%     87.05%
0         One         61.88%        53.11%       69.64%     87.05%
0         All         79.11%        57.63%       81.48%     86.09%
0.25      One         61.91%        53.11%       69.68%     87.25%
0.25      All         79.82%        69.84%       82.11%     86.89%
0.5       One         61.91%        53.11%       69.66%     87.49%
0.5       All         80.11%        70.09%       81.95%     87.31%
0.75      One         61.91%        53.11%       69.44%     87.75%
0.75      All         80.12%        70.20%       81.04%     87.65%
1         One         61.88%        53.11%       63.96%     87.78%
1         All         71.08%        68.75%       67.99%     87.86%
Maximum               80.12%        70.20%       82.11%     87.86%

p         #relations  CommonTopic  Affiliation (reverse)  ACitation (reverse)  APublication (reverse)
Baseline  One         55.74%       52.09%                 74.92%               56.85%
0         One         55.74%       52.09%                 74.92%               56.85%
0         All         65.79%       75.62%                 76.05%               62.35%
0.25      One         55.75%       52.09%                 74.97%               57.06%
0.25      All         66.66%       77.08%                 76.49%               63.40%
0.5       One         55.75%       52.09%                 74.98%               57.33%
0.5       All         67.57%       77.30%                 76.64%               64.04%
0.75      One         55.75%       52.09%                 74.80%               57.69%
0.75      All         68.94%       77.33%                 76.56%               64.48%
1         One         55.75%       52.09%                 68.25%               57.67%
1         All         70.14%       68.80%                 68.82%               62.30%
Maximum               70.14%       77.33%                 76.64%               64.48%

Table 3: Improvement in Average Accuracy (with Single Relation) Over
Baseline

p     #relations  CoAuthorship  Affiliation  ACitation  APublication
0     One         0.00%         0.00%        0.00%      -0.01%
0.25  One         0.02%         0.00%        0.04%      0.20%
0.5   One         0.02%         0.00%        0.02%      0.44%
0.75  One         0.02%         0.00%        -0.20%     0.70%
1     One         0.00%         0.00%        -5.68%     0.73%

p     #relations  CommonTopic  Affiliation (reverse)  ACitation (reverse)  APublication (reverse)
0     One         0.00%        0.00%                  0.00%                0.00%
0.25  One         0.00%        0.00%                  0.05%                0.22%
0.5   One         0.00%        0.00%                  0.06%                0.49%
0.75  One         0.00%        0.00%                  -0.12%               0.84%
1     One         0.00%        0.00%                  -6.67%               0.82%

Balanced model with multi-network outperforms baseline model.
The improvements that are achieved by the balanced model over the
baseline are presented in Table 5. For the balanced models with p = 0, 0.25, 0.75,
significant improvements in average ranking accuracies are observed for five
of the eight predicted relations (Affiliation, CoAuthorship, ACitation,
CommonTopic, reverse Affiliation) and small improvements are observed
for two other relations (reverse ACitation, reverse APublication). There
is little difference in average ranking accuracy for APublication. We also
note that models with less extreme values of p generally performed better
than those with p = 0 or p = 1. The low standard deviations in Table 6 indicate
that our observations are consistent. (Link prediction on CommonTopic has
an inherently higher variance due to the smaller number of Journals.)
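
As a worked check of how the improvement figures relate to the average accuracies, the sketch below subtracts the baseline row of Table 2 from the multi-network ("All") row at p = 0.5. The dictionary names are hypothetical; the numbers are copied from Table 2.

```python
# Average accuracies (percent) copied from Table 2.
baseline = {"CoAuthorship": 61.89, "Affiliation": 53.11,
            "ACitation": 69.64, "APublication": 87.05}
multi_p05 = {"CoAuthorship": 80.11, "Affiliation": 70.09,
             "ACitation": 81.95, "APublication": 87.31}

# Improvement in percentage points, rounded to two decimals like the tables.
improvement = {rel: round(multi_p05[rel] - baseline[rel], 2) for rel in baseline}
```

The differences come out at 18.22, 16.98, 12.31 and 0.26 percentage points, against 18.23, 16.98, 12.30 and 0.25 reported in Table 5. The residual gaps of about 0.01 are plausibly due to rounding of the tabulated averages.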

Balanced model exploits information from multi-network to improve
link prediction accuracy. Table 7 shows the improvement in
average ranking accuracy when the node similarities are computed using the
multi-network, versus using a single-relation network. Results here mirror
those in Table 5, showing varying degrees of improvement, from no
difference (APublication) to large improvements (reverse Affiliation). This
suggests that (1) information encoded in other relations is useful for link
prediction; and (2) our balanced method for computing multi-network link
similarity is able to exploit such information.

Table 4: Standard Deviation of Improvement in Accuracy (with Single
Relation) Over Baseline

p     #relations  CoAuthorship  Affiliation  ACitation  APublication
0 *   One         0.00%         0.00%        0.00%
0.25  One         0.03%         0.00%        0.03%      0.12%
0.5   One         0.03%         0.00%        0.08%      0.21%
0.75  One         0.03%         0.00%        0.26%      0.36%
1 *   One         0.07%         3.29%        0.55%

* One value in this row is missing from the source; the column alignment of
the remaining three values is uncertain.

p     #relations  CommonTopic  Affiliation (reverse)  ACitation (reverse)  APublication (reverse)
0     One         0.00%        0.00%                  0.00%                0.00%
0.25  One         0.01%        0.00%                  0.04%                0.24%
0.5   One         0.01%        0.00%                  0.05%                0.55%
0.75  One         0.01%        0.00%                  0.15%                0.94%
1     One         0.01%        0.01%                  3.63%                2.09%

Table 5: Improvement in Average Accuracy (with Multi-Network) Over
Baseline

p     #relations  CoAuthorship  Affiliation  ACitation  APublication
0     All         17.23%        4.52%        11.84%     -0.96%
0.25  All         17.93%        16.73%       12.47%     -0.16%
0.5   All         18.23%        16.98%       12.30%     0.25%
0.75  All         18.24%        --           11.40%     0.59%
1     All         --            --           -1.66%     0.81%

-- : value missing from the source.

p     #relations  CommonTopic  Affiliation (reverse)  ACitation (reverse)  APublication (reverse)
0     All         10.05%       23.53%                 1.13%                5.50%
0.25  All         10.91%       24.99%                 1.57%                6.55%
0.5   All         11.82%       25.21%                 1.72%                7.19%
0.75  All         13.19%       25.24%                 1.64%                7.63%
1     All         14.40%       16.70%                 -6.10%               5.45%

Table 6: Standard Deviation of Improvement in Accuracy (with
Multi-Network) Over Baseline

p     #relations  CoAuthorship  Affiliation  ACitation  APublication
0     All         1.94%         1.31%        2.19%      0.40%
0.25  All         2.08%         1.93%        2.27%      0.43%
0.5   All         2.15%         1.89%        2.12%      0.35%
0.75  All         2.27%         1.81%        1.93%      0.32%
1     All         3.43%         1.25%        3.87%      0.49%

p     #relations  CommonTopic  Affiliation (reverse)  ACitation (reverse)  APublication (reverse)
0     All         8.18%        2.36%                  0.79%                3.35%
0.25  All         8.20%        1.80%                  1.04%                3.56%
0.5   All         8.39%        1.75%                  1.12%                3.64%
0.75  All         9.29%        1.73%                  1.23%                3.72%
1     All         9.66%        1.91%                  3.89%                3.94%

Number of iterations to achieve convergence of node similarity
computation increases with p. This trend is shown in Figure 2.
As p increases, greater emphasis is placed on similarity from multi-hop
relationships versus immediate common neighbors. Thus, with larger
values of p, more iterations are required to achieve convergence (defined by
the convergence measure falling below $10^{-5}$ for every mode).

Furthermore, we note that convergence is quickly achieved in fewer than
10 iterations for p < 1. In addition, we observed empirically that convergence
is always achieved for all values of p, for all networks, across all years.
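
The dependence of the iteration count on p can be illustrated with a generic fixed-point loop. The sketch below is not the similarity update of the balanced model: `make_toy_update` is a stand-in contraction that blends a fixed matrix B (playing the role of the common-neighbors term) with the previous iterate, and the stopping rule (largest entry-wise change below 1e-5) is an assumption about the convergence test.

```python
def iterate_to_convergence(update, S0, tol=1e-5, max_iter=1000):
    """Apply `update` repeatedly until the largest entry-wise change is below tol.
    Returns the final matrix and the number of iterations used."""
    S = S0
    for n in range(1, max_iter + 1):
        S_next = update(S)
        change = max(abs(a - b)
                     for row_a, row_b in zip(S_next, S)
                     for a, b in zip(row_a, row_b))
        S = S_next
        if change < tol:
            return S, n
    return S, max_iter

def make_toy_update(B, p):
    """Toy update S <- (1 - p) * B + p * S (a stand-in, not the report's rule)."""
    def update(S):
        return [[(1 - p) * b + p * s for b, s in zip(row_b, row_s)]
                for row_b, row_s in zip(B, S)]
    return update

B = [[1.0, 0.4], [0.4, 1.0]]
S0 = [[0.0, 0.0], [0.0, 0.0]]
iterations = {p: iterate_to_convergence(make_toy_update(B, p), S0)[1]
              for p in (0.25, 0.5, 0.75)}
```

In this toy setting the error shrinks by a factor of p per step, so the number of iterations needed grows as p approaches 1, which is the same qualitative trend as described above.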

Table 7: Improvement in Average Accuracy with Multi-Network over
Single-Relation Network

(#relations: from single-relation to multi-network, all rows)

p     CoAuthorship  Affiliation  ACitation  APublication
0     17.23%        4.52%        11.84%     -0.96%
0.25  17.91%        16.73%       12.43%     -0.36%
0.5   18.20%        16.98%       12.28%     -0.19%
0.75  18.21%        17.09%       11.60%     -0.11%
1     9.20%         15.64%       4.03%      0.07%

p     CommonTopic  Affiliation (reverse)  ACitation (reverse)  APublication (reverse)
0     10.05%       23.53%                 1.13%                5.50%
0.25  10.91%       24.99%                 1.52%                6.33%
0.5   11.82%       25.21%                 1.65%                6.70%
0.75  13.19%       25.24%                 1.76%                6.79%
1     14.39%       16.70%                 0.57%                4.63%

[Plot: Average Number of Iterations; series: Multi-Network, Affiliation,
CoAuthorship, ACitation, APublication, CommonTopic]

Figure 2: Average number of iterations for convergence of balanced model

5 Related Work

The problem of link prediction has gathered increasing attention in the past
decade. In this section, we discuss some related work and where appropriate,
make comparisons with our approach.

5.1 Link prediction using node similarities

Liben-Nowell & Kleinberg [14] presented a survey of graph proximity or
“similarity” measures. Each such measure assigned a connection weight
score(x, y) to every pair of (homogeneous) nodes (x, y). By making the
homophily [12] assumption that nodes that are similar are more likely to
associate with each other, the authors are able to generate a ranking of
likelihood of adjacency formation based on the pair-wise node similarity values.
Unfortunately, the link prediction is restricted to only relations between
nodes of the same type. In this report, we adopt the general framework of
Liben-Nowell & Kleinberg in applying similarity measures to the task of link
prediction. However, we consider our technique as an improvement in two
major ways:

• Firstly, while the class of similarity measures in Liben-Nowell &
Kleinberg was applied to uni-modal uni-relation co-authorship networks,
we demonstrate a multi-network similarity measure in this paper.

• Secondly, our link prediction technique is vastly different from the
straightforward similarity-based ranking of Liben-Nowell & Kleinberg. We
assume link preference (a node is likely to form links with another node which
is similar to the nodes that it is already linked with) instead of
homophily (a node is likely to form a link with another node which is
similar to itself). We also point out that the similarity-based ranking
is not extensible to the multi-modal setting, whereas our link
prediction technique is easily applied to predicting relations across two types
of nodes.

Of the similarity measures covered in [14], common neighbors and
SimRank [7] bear the greatest resemblance to our method. The intuition behind
SimRank is that similar objects are related to similar objects. This resonates
with our notion of neighborhood similarity, although there are differences in
the exact implementation details (such as the form of normalization). Our
similarity measure can thus be seen as a combination of the SimRank and
common neighbors models, but further extended to the multi-network
setting. It is interesting to note that the link prediction results presented in this
paper are consistent with those in Liben-Nowell & Kleinberg [14]: a
common neighbors predictor (p = 0) tends to perform better than one based on
neighborhood similarity (p = 1) or SimRank. Nevertheless, the best results
are obtained by the balanced models with p = 0.25, 0.5, 0.75.
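
For readers unfamiliar with the two reference measures, the sketch below contrasts a common-neighbors count with a basic SimRank iteration on a small undirected graph. It is an illustration of those two standard measures only, not of the balanced model. The decay constant C = 0.8 and the fixed number of iterations are illustrative choices, and the function names are hypothetical.

```python
def common_neighbors(adj, a, b):
    """Number of neighbours shared by nodes a and b."""
    return len(adj[a] & adj[b])

def simrank(adj, C=0.8, iterations=10):
    """Basic SimRank: two nodes are similar if their neighbours are similar.
    adj maps each node to the set of its (in-)neighbours."""
    nodes = sorted(adj)
    sim = {a: {b: 1.0 if a == b else 0.0 for b in nodes} for a in nodes}
    for _ in range(iterations):
        new = {a: {} for a in nodes}
        for a in nodes:
            for b in nodes:
                if a == b:
                    new[a][b] = 1.0
                elif not adj[a] or not adj[b]:
                    new[a][b] = 0.0
                else:
                    total = sum(sim[x][y] for x in adj[a] for y in adj[b])
                    new[a][b] = C * total / (len(adj[a]) * len(adj[b]))
        sim = new
    return sim

# Path graph a - c - b: a and b share the single neighbour c.
adj = {"a": {"c"}, "b": {"c"}, "c": {"a", "b"}}
sim = simrank(adj)
shared = common_neighbors(adj, "a", "b")
```

On this path graph, a and b have one common neighbor and a SimRank score of C = 0.8, while a and c have no common neighbor and a SimRank score of 0.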

5.2  Statistical  Relational  Learning 

A popular class of approaches for modeling relational data is Statistical
Relational Learning (SRL). A survey of such methods is provided in Getoor
[4, 5]. We provide a brief description of a few SRL models below; for an
in-depth treatment of the topic, please refer to Getoor & Taskar [6].

Some of the earlier works in this field were by Popescul et al. [16, 15, 17],
who proposed Structural Logistic Regression for link analysis. Structural
Logistic Regression couples two main processes: (1) generation of features
from relational data and (2) their selection with statistical model selection
criteria. Thus, Structural Logistic Regression combines concepts from
inductive logic programming (for generation of features) and machine learning
(for selection of features). The two processes are executed iteratively, so that
generated features are selected by a statistical model selection, and in turn
selected features are used to generate new features from the relational data.
In [16, 15, 17], the binary logistic regression model was the statistical model
of choice, although the authors point out that it may be possible to use
other multi-class statistical classifiers as well.

Other SRL approaches like Probabilistic Relational Models (PRM) [3],
Relational Markov Networks (RMN) [23], etc. define probabilistic
models over the relational data. A PRM is in essence a directed probabilistic
graphical model, where a random variable in the PRM corresponds to an
attribute of an entity or potential adjacency in the network. Reference and
existence of potential adjacencies can be modeled as attributes as well. The
probability distribution over a random variable is then dependent on the
other attributes of the entity or adjacency, and possibly on the attributes of
related entities and adjacencies too. Importantly, the parameters of
probability distributions are shared between attributes of the same type.

RMNs are the undirected analogue of PRMs. Cliques are induced on
the set of entities/adjacencies and their attributes by the use of clique
templates. Each clique template performs a kind of SQL-style query on the
entities/adjacencies by selecting the appropriate attributes of entities which are
related in the specified way. Parameter sharing between cliques is achieved
by defining potentials on clique templates rather than individual cliques.

Domingos & Richardson [2] describe Markov logic, a unifying framework
for SRL methods that combines undirected probabilistic graphical models
(Markov networks) and first-order logic. Syntactically, Markov logic
augments first-order logic with a weight for every formula. Semantically, a set
of Markov logic formulae represents a probability distribution over possible
worlds, in the form of a log-linear model with one feature per grounding of
a formula in the set, with the corresponding weight. Said differently, given
a set of constants representing the entities in the world, a Markov network
is then induced, such that cliques in the Markov network correspond to the
Markov logic formulas, with log-linear potentials with the corresponding
weights. Domingos & Richardson also show how other SRL approaches,
including Structural Logistic Regression, PRMs and RMNs, map into Markov
logic models.
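
The log-linear semantics can be illustrated on the smallest possible example. The sketch below enumerates the four possible worlds for one constant and two predicates, with a single weighted formula "Smokes implies Cancer". The formula, its weight of 1.5 and the function name are illustrative assumptions; the probability of a world is proportional to exp(weight x number of true groundings).

```python
from itertools import product
from math import exp

def world_probabilities(weight):
    """Distribution over worlds (smokes, cancer) for one weighted formula:
    Smokes => Cancer. P(world) is proportional to exp(weight * n_true)."""
    worlds = list(product([False, True], repeat=2))

    def n_true(world):
        smokes, cancer = world
        return 1 if (not smokes) or cancer else 0  # one grounding of the formula

    unnormalized = {w: exp(weight * n_true(w)) for w in worlds}
    Z = sum(unnormalized.values())  # partition function
    return {w: u / Z for w, u in unnormalized.items()}

probs = world_probabilities(weight=1.5)
violating = probs[(True, False)]   # smokes without cancer: formula is false
satisfying = probs[(True, True)]   # formula is true
```

The world that violates the formula is not impossible, only less probable: its probability is lower than that of each satisfying world by a factor of exp(1.5).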

More recently, Xu et al. proposed the use of Infinite Hidden Relational
models (IHRM) [29] and Multi-Relational Gaussian Processes (MRGP) [28]
specifically for multi-relational learning. An IHRM introduces a random
variable for each potential link, and also an additional hidden random
variable for every entity. The hidden random variable can be seen as a hidden
attribute specifying the cluster to which the entity belongs, and is assumed
to determine the attributes of the entity. Links are also assumed to depend
only on the hidden random variables of the two entities involved. The
number of clusters is allowed to be infinitely large by using a Dirichlet process
mixture model.

Like IHRMs, MRGPs also have a latent variable for each entity,
encoding the essential property of the entity. In addition, another latent
variable is introduced for each entity and relation that it can be involved
in, representing the hidden causes for the entity to be involved in the
relation. The MRGP differs from the IHRM in that the latent/hidden variables
are outputs from Gaussian processes. The likelihood of a link formation is
then dependent on the essential properties and hidden causes of the entities
involved in the relation.

Our proposed solution for multi-network link prediction clearly differs
from such statistical modeling of relational data. We do not deny the
expressive power of SRL models, but nevertheless point out some of their potential
difficulties and shortcomings in link prediction. As noted in [4], the typically
small prior probability of a link causes difficulty for building statistical
models for link prediction. More importantly, exact inference using probabilistic
models tends to be computationally expensive, and in most situations only
approximate inference is possible.

Furthermore, the typical problem setting in the SRL papers differs from
our temporal link prediction setting. In the SRL papers, it is often assumed
that links are only partially observed; the problem is then to resolve the
uncertainty of unobserved potential links. On the other hand, in this report,
our interest lies in predicting formation of potential links at a later time t + 1,
given that we have observed links at an earlier time t. While we imagine
that it is possible to extend SRL models to incorporate a temporal aspect,
the temporal link prediction problem is nonetheless not pursued in the SRL
papers. Conversely, we do not pursue the problem of unobserved links in
this report.

5.3  Matrix  factorization 

Another class of approaches to multi-relational modeling is matrix
factorization. In these approaches, the adjacency or attribute matrices are
decomposed into low rank factors. It is necessary to determine beforehand the
ranks (or numbers of clusters) of the factorization. In [11], Long et al.
propose a Collective Factorization on Related Matrices for the purpose of
clustering. Specifically, to cluster each mode $X_p$ into $c_p$ clusters, the form of
factorization employed is $A_{p \to q} \approx C_p M_{p \to q} C_q^{\top}$, where $C_p \in \{0,1\}^{n_p \times c_p}$ is the
cluster indicator matrix such that $\sum_j C_p(i,j) = 1$ and $C_p(i,j) = 1$
denotes that $x_{p,i}$ is associated with the $j$th cluster. $M_{p \to q}$ is the cluster
association matrix such that $M_{p \to q}(i,j)$ denotes the association between the $i$th
cluster of mode $X_p$ and the $j$th cluster of $X_q$. An approximate factorization
is then achieved by minimizing a loss function comprised of the Frobenius
norms of $A_{p \to q} - C_p M_{p \to q} C_q^{\top}$ of every relation.
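
The loss for a single relation can be sketched as follows. This is an illustration under assumptions: the squared Frobenius norm is used (the exact form of the loss in [11] is not restated here), matrices are plain lists of rows, and the helper names are hypothetical.

```python
def matmul(X, Y):
    """Plain matrix product of two lists-of-rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]

def transpose(X):
    return [list(col) for col in zip(*X)]

def factorization_loss(A, C_p, M, C_q):
    """Squared Frobenius norm of A - C_p M C_q^T for one relation."""
    approx = matmul(matmul(C_p, M), transpose(C_q))
    return sum((a - b) ** 2
               for row_a, row_b in zip(A, approx)
               for a, b in zip(row_a, row_b))

# Three source nodes in two clusters, two target nodes in two clusters.
A = [[1, 0], [1, 0], [0, 1]]
C_p = [[1, 0], [1, 0], [0, 1]]   # each row has exactly one 1 (cluster indicator)
C_q = [[1, 0], [0, 1]]
exact = factorization_loss(A, C_p, [[1.0, 0.0], [0.0, 1.0]], C_q)
weaker = factorization_loss(A, C_p, [[0.5, 0.0], [0.0, 1.0]], C_q)
```

With the identity association matrix the toy relation is reproduced exactly (loss 0), whereas halving one association strength leaves a residual of 0.5.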

Tang et al. [22, 21] then extend this model by including a temporal
aspect, and imposing additional loss from changes in the cluster membership
over time. The temporal collective factorization model is then applied to
community detection in dynamic multi-mode networks.

Singh & Gordon [19] suggest an alternative form of factorization:
$A_{p \to q} \approx f(L_p L_q^{\top})$, where $L_p \in \mathbb{R}^{n_p \times c}$ is the low rank factor for mode $X_p$,
$f : \mathbb{R}^{n_p \times n_q} \to \mathbb{R}^{n_p \times n_q}$, and $c$ is the rank of factorization. Singh & Gordon
propose an iterative Newton-Raphson solution based on minimizing the
Bregman divergences between the model and the relation matrices. Lippert et
al. [10] adopt a factorization similar to that of Singh & Gordon, but minimize an
alternative objective through gradient descent. Both Singh & Gordon and
Lippert et al. apply their collective matrix factorizations to the task of
predicting unobserved links.
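
A minimal sketch of the forward model is given below. The choice of an element-wise logistic function for f and the use of the log loss to compare the model with a 0/1 relation matrix are illustrative assumptions, as are the function names; no fitting procedure is shown.

```python
from math import exp, log

def sigmoid(z):
    return 1.0 / (1.0 + exp(-z))

def predict(L_p, L_q):
    """Element-wise f(L_p L_q^T) with f = logistic function (assumed link)."""
    return [[sigmoid(sum(a * b for a, b in zip(row_p, row_q))) for row_q in L_q]
            for row_p in L_p]

def log_loss(A, P):
    """Negative log-likelihood of the 0/1 matrix A under link probabilities P."""
    return -sum(a * log(p) + (1 - a) * log(1 - p)
                for row_a, row_p in zip(A, P)
                for a, p in zip(row_a, row_p))

A = [[1, 0], [0, 1]]
zero_factors = [[0.0], [0.0]]     # rank-1 factors carrying no information
fitted_factors = [[2.0], [-2.0]]  # hand-picked factors that match A

loss_zero = log_loss(A, predict(zero_factors, zero_factors))
loss_fit = log_loss(A, predict(fitted_factors, fitted_factors))
```

With zero factors every link probability is 0.5, so the loss is 4 log 2. The hand-picked rank-1 factors give probabilities near 0.98 on the observed links and near 0.02 elsewhere, and a correspondingly smaller loss.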

5.4  Other  domains 

To the best of our knowledge, Li et al. [9] come closest to our approach of
modeling multi-network node similarities. Li et al. are primarily concerned
with the different problem of content-based image retrieval (CBIR), and to
that end, propose four inter-dependent similarity matrices: Sb for blobs, Sw
for words, STb for images calculated on their constituent blobs, and STw
for images calculated on their constituent words. An iterative estimation
of the similarity matrices, similar to our approach, is proposed. In the
proposed types of interactions between similarity matrices, each similarity
matrix is influenced by at most one other similarity matrix.

6 Conclusion and Future Directions

We have presented a method for computing multi-network node similarities,
and discussed how link prediction can be performed by adjacency
propagation through the similarity matrices. Our experiment demonstrates that (1)
information encoded in other relations is useful for link prediction; and (2)
our balanced method for computing multi-network link similarity is able to
exploit such information, improving link prediction by up to 25%.

In  this  report,  we  have  validated  our  model  on  one  dataset,  albeit  on 
multiple  different  types  of  relations.  It  would  be  interesting  to  apply  the 
model  to  other  datasets  of  different  nature,  and  to  analyse  any  differences 
in  performance. 

Furthermore, the node similarity values may potentially be applied to
network analysis problems other than link prediction. For instance,
multidimensional scaling may be applied to the node similarity matrices for
clustering and community derivation.

We have thus far been focusing on exploiting structural information for
node similarity estimation and link prediction. It may be possible to further
improve link prediction accuracy by incorporating other non-topological
information, e.g. entity attributes, into the computation of node similarities.

Finally, other possible future directions of research are to explore using
different parameters $p_{p \to q}$ for each relation instead of a single parameter p,
and to automatically learn the optimal p or $p_{p \to q}$ values from data.

1 Overview

In this project, we investigate characterization and measurement of
interaction behaviors in information exchange networks based on user-generated
interaction data. We will focus on multimodal information exchange
networks which involve actors sending information to one another. Examples
of such networks include email, messaging, and blog networks.

We focus on modeling engagingness (see footnote 1) and responsiveness behaviors in
email networks and messaging networks. We have used Enron Email data
and MyGamma Social Network Message data as the target datasets. The
former is so far the only known publicly available information exchange data
with messages assigned to specific senders and recipients. Email data
preprocessing and thread assembly were conducted on the dataset. We also
introduced several engagingness and responsiveness models, and proposed
to use them as features in solving the email reply order prediction task.

The  MyGamma  social  network  message  dataset  is  from  a  proprietary 
mobile  social  networking  site  known  as  myGamma.  MyGamma  is  owned  by 
BuzzCity  Pte  Ltd,  a  Singapore  company.  This  dataset  offers  both  messaging 
and  friendship  network  data  for  our  research.  We  have  adapted  our  proposed 
behavior  models  and  developed  new  ones  for  this  myGamma  dataset. 

1 In our previous documents, the term “activeness” was used. We have since adopted
“engagingness” as a more appropriate term.
2 Behavior Modeling in Enron Email Network


The following summarizes the important research contributions of our work
in behavior modeling for email networks:

• We define four categories of models for engagingness and responsiveness
behaviors prevalent in email networks: (a) email based,
(b) email thread based, (c) email sequence based, and (d) social cognitive
models. Within each category, one can define different
behavior models based on different email attributes. To the best of
our knowledge, this is the first time engagingness and responsiveness
behavior models have been studied systematically.

• We apply our proposed behavior models to the Enron email network,
and analyze and compare them. We preprocess the email data
and establish links between emails and their replies. In our empirical
study, we found that engagingness and responsiveness are distinct from
each other, while most engagingness models (and likewise most
responsiveness models) rank users consistently with one another.

• We introduce email reply order prediction as a novel task that uses
engagingness, responsiveness and other email features as input features.
An SVM classifier is learnt from the features of training email
pairs and applied to test email pairs. In our experiments, the SVM
classifier reaches an accuracy of about 77%, compared with 50% for
random guessing. This indicates that user behaviors are
useful in the prediction task.

2.1  Engagingness  and  Responsiveness  Behavior  Models 

In this section, we describe our proposed behavior models for user
engagingness and responsiveness. All the models assume that emails have been
preprocessed with duplicate elimination and email reply relationship
identification. We divide our models into the following categories:

• Email based models: These models consider emails as the basic data
units for measuring user behaviors. Email attributes such as sender,
recipient list, date, etc., are used.

• Email thread based models: These models consider email threads
as the basic data units for measuring user behaviors. The models
therefore use attributes of email threads to quantify behaviors.
Table 1: Notations.

Symbol                              Meaning
$S(u_i)$                            Emails sent by user $u_i$
$R(u_i)$                            Emails received by $u_i$
$RB(u_i)$                           Email replies sent by $u_i$
$RT(u_i)$                           Emails replying to $u_i$'s earlier emails
$TH(u_i)$                           Threads started by an email sent by $u_i$
$r(e)$                              Reply to email $e$
$Sdr(e)$                            Sender of email $e$
$Rcp(e)$                            Recipients (in both To and Cc lists) of email $e$
$t(e)$                              Sent time of email $e$
$E(u_i \to u_j)$                    Emails from $u_i$ to $u_j$
$E(u_i \leftrightarrow u_j)$        Emails between $u_i$ and $u_j$
$rt(u_i \to u_j)$                   Avg. response time from $u_i$ to $u_j$
$rt(u_i \leftrightarrow u_j)$       Avg. response time between $u_i$ and $u_j$
$RE(u_i \to u_j)$                   Reply emails from $u_i$ to $u_j$
$RE(u_i \leftrightarrow u_j)$       Reply emails between $u_i$ and $u_j$

Figure 1: Taxonomy of Models. Email Based Models: Email Count (EC), Email
Recipient (ER), Email Reply Time (ET). Email Thread Based Models: Thread
Count (TC). Email Sequence Based Models: Reply Gap (RG). Social Cognitive
Models: Random Walk (RW).


• Email sequence based models: These models examine the sequence
of emails received and replied to by each user and derive the user
behaviors from the gaps between emails received and their replies.

• Social cognitive models: These models consider the social perception
of user behaviors within the email network and measure behaviors
accordingly.

Figure 1 shows the taxonomy of behavior models in the above categories,
which are defined in the following sections. Each model $M$ consists
of a pair of engagingness ($A^M$) and responsiveness ($R^M$) score formulas
defined based on some principles. The $A^M$ and $R^M$ score values lie in the
range [0,1], with 0 and 1 representing the lowest and highest values
respectively. Table 1 lists the symbols used in this report and their meanings.
2.1.1 Email Based Models


Email  Count  Model  (EC) 

The  email  count  model  is  defined  based  on  the  principle  that  an  engaging 
user  should  have  most  of  his/her  emails  replied,  while  a  responsive  user 
should  have  most  of  his/her  received  emails  replied.  The  engagingness  and 
responsiveness  formulas  are  thus  defined  by: 


\[
A^{EC}(u_i) = \frac{|RT(u_i)|}{|S(u_i)|} \tag{1}
\]

\[
R^{EC}(u_i) = \frac{|RB(u_i)|}{|R(u_i)|} \tag{2}
\]

For users with empty $S(u_i)$ (or $R(u_i)$), $A^{EC}(u_i)$ (or $R^{EC}(u_i)$) is
assigned a zero value.
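
As an illustration of Equations (1) and (2), the following minimal Python sketch computes both scores for one user from four collections of email identifiers. The function name, argument names and sample identifiers are hypothetical; only the two ratios and the zero value for empty sets come from the text above.

```python
def email_count_scores(sent, received, replies_to_user, replies_by_user):
    """Return (A_EC, R_EC) for one user.

    sent            -- S(u_i): emails sent by the user
    received        -- R(u_i): emails received by the user
    replies_to_user -- RT(u_i): emails replying to the user's earlier emails
    replies_by_user -- RB(u_i): replies sent by the user
    """
    # Users with no sent (or received) emails get a zero score.
    a_ec = len(replies_to_user) / len(sent) if sent else 0.0
    r_ec = len(replies_by_user) / len(received) if received else 0.0
    return a_ec, r_ec


# Hypothetical user: 5 emails sent, 3 of them answered; 2 received, both answered.
a_ec, r_ec = email_count_scores(
    sent={"e1", "e2", "e3", "e4", "e5"},
    received={"m1", "m2"},
    replies_to_user={"r1", "r2", "r3"},
    replies_by_user={"x1", "x2"},
)
```

The sets are assumed to hold deduplicated emails, in line with the preprocessing assumption stated at the start of Section 2.1.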

Email  Recipient  Model  (ER) 

The intuition of this model is that an email with many recipients is likely
to receive very few replies. Hence, an engaging user is one who gets replies
from many recipients of his/her emails, while a non-engaging user receives
very few or no replies even when his/her emails are sent to many recipients.
On the other hand, a responsive user is one who replies to emails regardless
of the number of recipients in the emails. A non-responsive user is one
who does not reply even if the emails are directed to him/her only. The
engagingness and responsiveness formulas are thus defined by:


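\[
A^{ER}(u_i) = \frac{1}{|S(u_i)|} \sum_{e \in S(u_i)}
  \frac{|\{u_j \in Rcp(e) \wedge r(e) \in RB(u_j)\}|}{|Rcp(e)|} \tag{3}
\]

\[
R^{ER}(u_i) = \frac{1}{|R(u_i)|}
  \sum_{\substack{e \in RB(u_i)\ \text{s.t.} \\ \exists u_j, \exists e'' \in S(u_j),\ r(e'') = e}}
  \frac{|Rcp(e)|}{MaxRcpCnt} \tag{4}
\]

where $MaxRcpCnt$ denotes the largest recipient count among all Enron
emails.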
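
The engagingness formula of Equation (3) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the input layout (one pair of recipient set and replier set per sent email) and the zero score for a user with no sent emails are assumptions made here.

```python
def er_engagingness(sent_emails):
    """A_ER sketch: mean over sent emails of (recipients who replied) / (recipients).

    sent_emails -- list of (recipients, repliers) pairs, each a set of user ids;
                   `repliers` holds the users who replied to that email.
    """
    if not sent_emails:
        return 0.0  # assumption: mirror the zero value used by the EC model
    total = 0.0
    for recipients, repliers in sent_emails:
        if recipients:
            # Only replies coming from actual recipients are counted.
            total += len(recipients & repliers) / len(recipients)
    return total / len(sent_emails)


# Hypothetical mailbox: three sent emails with 2, 3 and 1 recipients.
a_er = er_engagingness([
    ({"a", "b"}, {"a"}),            # 1 of 2 recipients replied
    ({"a", "b", "c"}, {"a", "b"}),  # 2 of 3 recipients replied
    ({"a"}, set()),                 # no reply
])
```

An email sent to many recipients contributes a small fraction unless many of them reply, which is how the model separates engaging senders from senders who merely broadcast.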

Email  Reply  Time  Model  (ET) 

The reply time of an email can be an indicator of user engagingness and
responsiveness. The email reply time model adopts the principle that
engaging users receive replies sooner than non-engaging users, while
responsive users reply to received emails more quickly than non-responsive
users.

Figure 2: Examples. (a) Engagingness of $u_i$. (b) Responsiveness of $u_i$.

Given an email $e'$ which is a reply to email $e$, i.e. $e' = r(e)$, the reply
time of $e'$ is $RT(e') = t(e') - t(e)$. The z-normalized reply time
$\widehat{RT}(e')$ is defined by

\[
\widehat{RT}(e') = \frac{RT(e') - \overline{RT}}{\sigma_{RT}}
\]

where $\overline{RT}$ and $\sigma_{RT}$ are the mean and standard deviation of
reply time respectively. Now, we define the engagingness and responsiveness of
the ET model as:

\[
A^{ET}(u_i) = \frac{1}{|S(u_i)|} \sum_{e \in S(u_i)} \frac{1}{|Rcp(e)|}
  \sum_{\substack{u_j \in Rcp(e), \\ \exists e' \in RB(u_j),\ e' = r(e)}}
  f(\widehat{RT}(e')) \tag{5}
\]

\[
R^{ET}(u_i) = \frac{1}{|R(u_i)|}
  \sum_{e' \in RB(u_i),\ e \in R(u_i),\ r(e) = e'} f(\widehat{RT}(e')) \tag{6}
\]


where 


(7) 


The function $f(\cdot)$ is designed to convert the normalized reply time to the
range [0,1], with 0 and 1 representing extremely slow and extremely fast reply
times respectively.

Examples 

Consider the email network in Figure 2(a). Let $e'_k$ denote the reply
to email $e_k$. The engagingness values of $u_i$ derived by the EC and ER email
based models are: (a) $A^{EC} = 0.6$; and (b) $A^{ER} = 0.58$. Suppose
$RT(e'_1) = 5$, $RT(e'_2) = 10$, and $RT(e'_3) = 20$. Combining $f(5)$, $f(10)$
and $f(20)$ as in Equation (5), the engagingness of $u_i$ according to the ET
model is $A^{ET} = 0.45$.
Figure 3: Email thread example.

Table 2: Distribution of emails per thread.

# emails     2        3       4       5     6     ≥ 7   Total
# threads    11,302   3,925   1,614   732   404   616   18,593

Consider the email network in Figure 2(b). The responsiveness values
of $u_i$ derived by the EC and ER models are: (a) $R^{EC} = 1$; and
(b) $R^{ER} = 0.38$.

2.1.2  Email  Thread  Based  Model 

Here, we define the thread count model (TC) as an email thread based
model. In the email count model, engagingness is measured by the emails sent
by a sender and the sent emails directly replied to by some recipient(s).
However, a direct reply is not the only type of response to an email. An email
may be replied to indirectly within an email thread as a result of forwarding.
For example, as illustrated in Figure 3, a user $u_1$ advertises a job position
by sending an email to $u_2$, who subsequently forwards the email to his
student $u_3$. If $u_3$ replies to $u_1$, we say that the original email is
replied to indirectly in an email thread.

An email thread is defined as a tree of emails connected by reply and forward
relationships. Table 2 shows the distribution of threads by the number
of emails per thread. As can be seen, the distribution roughly follows Zipf's
law. The majority of threads (11,302 of 18,593) contain only two emails, and
3,925 threads contain three emails. The largest thread contains 37 emails.

Based on email threads, the thread count model includes indirect replies
to emails forwarded between users, using the following principle: a user is
highly engaging if many of his or her emails are replied to directly or
indirectly by recipients, and is highly responsive if he or she replies to or
forwards most of the emails received earlier. The engagingness and
responsiveness of a user $u_i$ are defined as:

\[
A^{TC}(u_i) = \frac{1}{|S(u_i)|} \cdot
  \left|\{ e \in S(u_i) \mid \exists t \in TH(u_i), \exists e',\
  e \xrightarrow{t} e' \wedge u_i \in Rcp(e') \}\right| \tag{8}
\]

\[
R^{TC}(u_i) = \frac{1}{|R(u_i)|} \cdot
  \left|\{ e \in R(u_i) \mid \exists u_j, e', t \in TH(u_j),\
  e \xrightarrow{t} e' \wedge u_j \in Rcp(e') \}\right| \tag{9}
\]

where $e \xrightarrow{t} e'$ returns TRUE when $e$ is directly or indirectly
connected to $e'$ in the thread $t$, and FALSE otherwise.
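
A minimal Python sketch of the engagingness side of the thread count model is given below. It assumes that each email stores a `parent` pointer to the email it replies to or forwards, and it reads "directly or indirectly connected" as "is a descendant in the thread tree"; both the data layout and that reading are assumptions made for illustration.

```python
def tc_engagingness(user, emails):
    """A_TC sketch: share of the user's sent emails that have a descendant
    (through any chain of replies and forwards) addressed back to the user.

    emails -- dict: email id -> {"sender", "recipients" (set), "parent" (id or None)}
    """
    # Build the child lists of the thread trees from the parent pointers.
    children = {}
    for eid, email in emails.items():
        if email["parent"] is not None:
            children.setdefault(email["parent"], []).append(eid)

    def answered(eid):
        # Depth-first search over all descendants of eid.
        stack = list(children.get(eid, []))
        while stack:
            cur = stack.pop()
            if user in emails[cur]["recipients"]:
                return True
            stack.extend(children.get(cur, []))
        return False

    sent = [eid for eid, email in emails.items() if email["sender"] == user]
    if not sent:
        return 0.0
    return sum(answered(eid) for eid in sent) / len(sent)


# The job-advert scenario from the text, plus one unanswered email.
emails = {
    "e1": {"sender": "u1", "recipients": {"u2"}, "parent": None},
    "e2": {"sender": "u2", "recipients": {"u3"}, "parent": "e1"},  # forward
    "e3": {"sender": "u3", "recipients": {"u1"}, "parent": "e2"},  # indirect reply
    "e4": {"sender": "u1", "recipients": {"u2"}, "parent": None},  # never answered
}
a_tc_u1 = tc_engagingness("u1", emails)
```

Here `e1` counts as answered even though `u2` never replied, because `u3`'s message reaches `u1` through the forward chain, while `e4` does not count.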

2.1.3  Email  Sequence  Based  Model 

An email sequence refers to the sequence of emails sent and received by a user,
ordered by time. To derive engagingness and responsiveness from email
sequences, we adopt the principle that an engaging user is expected to
have his or her sent emails replied to soon after they are received by the
recipients, and a responsive user replies soon after receiving emails. As
users may not always stay online, the time taken to reply to an email may vary
widely. Instead of time, we therefore use the number of emails that are
received later than an email $e$ but replied to before $e$ as a proxy for how
soon $e$ is replied to.

The above principle is used to develop the reply gap model (RG).
Let $seq_i$ denote the email sequence of user $u_i$. When an email received by
$u_i$ is replied to before other email(s) received earlier, the reply to the
former is known as an out-of-order reply. Formally, for an email $e$ received
by $u_i$, we define the number of emails received and the number of
out-of-order replies between $e$ and its reply $e'$ in $seq_i$, denoted by
$n_r(u_i, e)$ and $n_o(u_i, e)$ respectively, as

\[
n_r(u_i, e) =
\begin{cases}
\#\ \text{emails received between } e \text{ and } e' \text{ in } seq_i,
  & \text{if } \exists e' \in RT(u_i),\ r(e) = e' \\
-1, & \text{otherwise}
\end{cases}
\tag{10}
\]

\[
n_o(u_i, e) =
\begin{cases}
\#\ \text{emails received between } e \text{ and } e' \text{ in } seq_i
  \text{ that have been replied},
  & \text{if } \exists e' \in RT(u_i),\ r(e) = e' \\
-1, & \text{otherwise}
\end{cases}
\tag{11}
\]

The $-1$ value is assigned to $n_r$ and $n_o$ when $e$ is not replied to at
all. The user engagingness and responsiveness of the RG model are thus defined
as:

\[
A^{RG}(u_i) = \frac{\sum_{e \in S(u_i)} \frac{1}{|Rcp(e)|}
  \sum_{u_j \in Rcp(e)} \left(1 - \frac{n_o(u_j, e)}{n_r(u_j, e)}\right)}
  {|S(u_i)|} \tag{12}
\]

\[
R^{RG}(u_i) = \frac{\sum_{e \in R(u_i)}
  \left(1 - \frac{n_o(u_i, e)}{n_r(u_i, e)}\right)}{|R(u_i)|} \tag{13}
\]

For example, let $seq_i$ be an email sequence of user $u_i$ that contains the
emails $e_1, \ldots, e_4$ together with replies $e'_k = r(e_k)$. Averaging the
terms $1 - \frac{n_o(u_i, e_k)}{n_r(u_i, e_k)}$ over the four emails gives
$A^{RG}(u_i) = 0.625$. The responsiveness of $u_i$ can be computed in the same
manner.
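
The reply gap quantities can be sketched in Python as follows. The event encoding, the function names and the sample sequence are hypothetical, and two points the text leaves open are filled with labeled assumptions: an "out-of-order reply" is counted when a later arrival is answered before the reply to $e$, and a ratio with $n_r = 0$ is treated as 0.

```python
def reply_gap_terms(events):
    """Return {email_id: (n_r, n_o)} for one user's time-ordered sequence.

    events -- list of ("recv", email_id) and ("reply", email_id) tuples, where
              ("reply", x) is the user's reply to the received email x.
    """
    reply_pos = {eid: i for i, (kind, eid) in enumerate(events) if kind == "reply"}
    terms = {}
    for i, (kind, eid) in enumerate(events):
        if kind != "recv":
            continue
        if eid not in reply_pos:
            terms[eid] = (-1, -1)  # e is never replied to
            continue
        end = reply_pos[eid]
        later = [x for k, x in events[i + 1:end] if k == "recv"]
        n_r = len(later)
        # Out-of-order replies: later arrivals answered before e's own reply.
        n_o = sum(1 for x in later if x in reply_pos and reply_pos[x] < end)
        terms[eid] = (n_r, n_o)
    return terms


def rg_responsiveness(events):
    """R_RG sketch: mean of 1 - n_o / n_r over the received emails."""
    terms = reply_gap_terms(events)
    if not terms:
        return 0.0
    total = 0.0
    for n_r, n_o in terms.values():
        ratio = 0.0 if n_r == 0 else n_o / n_r  # assumption: 0/0 means no gap
        total += 1.0 - ratio
    return total / len(terms)


# Hypothetical sequence: e3 is answered before e1, and e2 is never answered.
events = [("recv", "e1"), ("recv", "e2"), ("recv", "e3"), ("reply", "e3"),
          ("reply", "e1"), ("recv", "e4"), ("reply", "e4")]
terms = reply_gap_terms(events)
r_rg = rg_responsiveness(events)
```

An unanswered email yields $-1/-1 = 1$ and therefore contributes 0, so ignoring emails lowers the score just as replying out of order does.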


2.1.4  Social  Cognitive  Model 

A social cognitive model is based on social cognitive theory, which suggests
that people learn by watching what others do [8]. Such models thus
measure a user's engagingness and responsiveness by observing how other users
react to emails sent by the user and how users interact with one another by
email. In this report, we introduce a random walk
(RW) social cognitive model.

For engagingness, each user $u_k$ perceives a user $u_i$ to be more engaging
than another user $u_j$ if more emails from $u_i$ are replied to ahead of
emails from $u_j$, based on the emails in the mailbox of $u_k$. For instance,
suppose that $u_k$ has an email sequence
$seq_k = \langle e_1(u_1, \{u_k\}), e_2(u_2, \{u_k\}), e'_2(u_k, \{u_2\}), e'_1(u_k, \{u_1\}) \rangle$,
where $e_v(u_x, U_y)$ denotes email $e_v$ sent by $u_x$ to recipients $U_y$ and
$e'_v$ denotes the reply to email $e_v$. $u_k$ receives $e_1$ before $e_2$ but
the reply $e'_1$ comes after $e'_2$. This indicates that $u_k$ considers $u_2$
more important than $u_1$. Furthermore, $u_2$ is more engaging than $u_1$ from
$u_k$'s standpoint. Based on the above observation, we say that $u_k$ observes
the engagingness superiority of $u_2$ over $u_1$.

Similarly for responsiveness, $u_k$ perceives a user $u_1$ to be more
responsive than another user $u_2$ if $u_k$ observes reply emails from $u_1$
earlier than from $u_2$ for the same emails sent to both $u_1$ and $u_2$, which
can be from $u_k$ or other users.

Formally, we represent an engagingness weighted directed graph
$G_A = (U, E_A)$ as follows:

• $U$ represents the set of all users.

• $E_A$ consists of directed edges. When, in the mailbox of some $u_k$, $u_i$
has $x_k$ emails replied to ahead of emails from $u_j$, we represent this by a
directed edge $u_j \to u_i$.

• The weight of $u_j \to u_i$, $weight(u_j \to u_i)$, is the sum of the $x_k$'s
over all $u_k$'s. The larger $weight(u_j \to u_i)$ is, the more users observe
that $u_i$ is more engaging than $u_j$.

In a similar manner, we can define a responsiveness weighted directed
graph $G_R = (U, E_R)$.

The engagingness (or responsiveness) weighted directed graph is
further processed to derive the degree of engagingness (or responsiveness)
of users. Each directed graph so far captures the perceived relative
difference between users in engagingness (or responsiveness), but does
not immediately assign engagingness or responsiveness scores to the users. We
therefore propose to perform a random walk on the engagingness (or
responsiveness) graph and take the stationary probabilities of visiting the
users as their engagingness (or responsiveness) values.

The random walk process on the engagingness graph, which yields the
engagingness values $A^{RW}(u_k)$ of the users, consists of the following steps:

1. Determine the largest node aggregated edge weight,
$MaxWeight = \max_{u_j} \{ \sum_{u_i} weight(u_j \to u_i) \}$

2. For each user $u_j$,

(a) $sum_j = 0$

(b) For each edge $u_j \to u_i$,

i. Assign a transition probability to $u_j \to u_i$ as
$p(u_j, u_i) = \frac{weight(u_j \to u_i)}{MaxWeight}$

ii. $sum_j = sum_j + p(u_j, u_i)$

(c) // assign the remaining weight to all users.

Create an edge $u_j \to u_t$ for all $u_t$ with
$p(u_j, u_t) = \frac{1 - sum_j}{|U|}$ if $u_j \to u_t$ does not exist;

Assign $p(u_j, u_t) \mathrel{+}= \frac{1 - sum_j}{|U|}$ otherwise

3. For each user $u_i$, initialize $A^{RW}(u_i)$ randomly

4. Repeat the following steps:

(a) For each $u_i$, $A^{RW}_{old}(u_i) = A^{RW}(u_i)$

(b) For each $u_i$,
$A^{RW}(u_i) = \sum_{u_j \to u_i} p(u_j, u_i) \cdot A^{RW}_{old}(u_j)$

5. Until $|A^{RW}(u_i) - A^{RW}_{old}(u_i)| < \epsilon$ (see footnote 2) for all
$u_i$'s

Figure 4: Social cognitive model. (a) Engagingness weighted directed graph.
(b) Engagingness graph for random walk.

To illustrate the above algorithm, consider the example in Figure 4. $u_2$ is
more engaging than $u_1$, with $weight(u_1 \to u_2) = 0.9$. On the other hand,
$u_1$ is more engaging than $u_2$ with $weight(u_2 \to u_1) = 0.4$. In
Figure 4(a), the total engagingness weight of $u_1$ to all nodes $u_2$ and
$u_3$ in the engagingness weighted directed graph is
$weight(u_1) = weight(u_1 \to u_2) + weight(u_1 \to u_3) = 1.4$. In the same
way, the engagingness weights of $u_2$ and $u_3$ are 0.6 and 0.6, respectively.
Then, the weight value of each link is normalized by the maximum weight value,
$MaxW = weight(u_1)$. E.g.,
$weight(u_2 \to u_3) = \frac{weight(u_2 \to u_3)}{MaxW} = \frac{0.2}{1.4}$.
For nodes with total weight < 1, the unused weight will be used to create links
with equal weights to all the nodes. E.g., $u_2$ has an unused weight of
$\frac{1.4 - 0.6}{1.4}$. As a result of the new links for the unused weight,
$weight(u_2 \to u_3) = \frac{0.2}{1.4} + \frac{1.4 - 0.6}{1.4} \cdot \frac{1}{3} = 0.33$.
After this process, the engagingness graph is row-stochastic because its
entries are nonnegative and the sum of each row is one. This stochastic matrix
can be viewed as the transition matrix of a Markov chain, where
each entry $(u_i, u_j)$ represents the probability of a transition from state
$u_i$ to state $u_j$.
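
The normalization and iteration steps can be sketched in Python as below. This is an illustrative reading of the algorithm rather than the authors' code: the function names are hypothetical, the walk starts from a uniform vector instead of a random one so that the run is reproducible, at least one positive edge weight is assumed, and the split of $u_3$'s total weight of 0.6 between its two edges is invented for the example.

```python
def transition_matrix(users, weight):
    """Row-stochastic transition probabilities from raw superiority weights.

    weight -- dict mapping (u_j, u_i) to weight(u_j -> u_i).
    """
    out_sum = {u: 0.0 for u in users}
    for (src, _dst), w in weight.items():
        out_sum[src] += w
    max_weight = max(out_sum.values())  # assumes at least one positive weight
    p = {}
    for src in users:
        # Probability mass left over after normalizing by MaxWeight.
        leftover = 1.0 - out_sum[src] / max_weight
        for dst in users:
            p[(src, dst)] = (weight.get((src, dst), 0.0) / max_weight
                             + leftover / len(users))
    return p


def random_walk_scores(users, p, eps=1e-7, max_iter=1000):
    """Iterate A(u_i) = sum_j p(u_j, u_i) * A_old(u_j) until the change is below eps."""
    score = {u: 1.0 / len(users) for u in users}  # uniform start (assumption)
    for _ in range(max_iter):
        new = {u: sum(p[(v, u)] * score[v] for v in users) for u in users}
        if all(abs(new[u] - score[u]) < eps for u in users):
            return new
        score = new
    return score


users = ["u1", "u2", "u3"]
weight = {
    ("u1", "u2"): 0.9, ("u1", "u3"): 0.5,
    ("u2", "u1"): 0.4, ("u2", "u3"): 0.2,
    ("u3", "u1"): 0.3, ("u3", "u2"): 0.3,  # hypothetical split of u3's 0.6
}
p = transition_matrix(users, weight)
scores = random_walk_scores(users, p)
```

With these weights the entry `p[("u2", "u3")]` evaluates to one third, matching the 0.33 worked out in the text, and every row of `p` sums to one.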

2.2  Email  Reply  Order  Prediction 

We now consider email reply order prediction, which has the following
setup. Given a pair of emails $(e_i, e_j)$ sent to the same user by users $u_i$
and $u_j$ respectively, we want to determine the order in which the two emails
will be replied to. Here, we assume that both $e_i$ and $e_j$ require replies
and that $u_i$ and $u_j$ are not the same person. The outcome of prediction is
either $e_i$ first or $e_j$ first.

Our proposed method is to train a Support Vector Machine (SVM) classifier
using labeled email pairs, and to apply the trained classifier to unseen
email pairs. For each email pair, we can derive features directly from the
emails themselves and from their senders, including the previous emails they
have sent and received. Three types of features are used, namely:
(a) comparative email features (E), (b) comparative interaction features (I)
and (c) comparative behavior features (B).

2 In our experiment, we used $\epsilon = 10^{-7}$; the numbers of iterations
required to compute $A^{RW}$ and $R^{RW}$ were 8 and 12 respectively.
Table 3: Email Features E.

No   Description
1    $t(e)$
2    $size(e)$
3    $size(r(e))$ (assuming we can determine the reply)
4    $size(e) + size(r(e))$
5    $|Rcp(e)|$
6    $indegree(Sdr(e))$ (# users sending emails to $Sdr(e)$)
7    $outdegree(Sdr(e))$ (# users receiving emails from $Sdr(e)$)
8    $indegree(Sdr(e)) + outdegree(Sdr(e))$
9    $|S(Sdr(e))|$
10   $|R(Sdr(e))|$
11   Avg. $|S(Sdr(e))|$ per day
12   Avg. $|R(Sdr(e))|$ per day
13   $|RB(Sdr(e))| / |S(Sdr(e))|$
14   $|RT(Sdr(e))| / |R(Sdr(e))|$
15   $|RT(Sdr(e))| / |S(Sdr(e))|$
16   $|RB(Sdr(e))| / |R(Sdr(e))|$
17   Avg. response time for emails in $RT(Sdr(e))$
18   Avg. response time for emails in $RB(Sdr(e))$

Table 3 lists the email features used in our classifier. For each email
feature $f_k$, we derive a corresponding comparative feature $f_k^c$ of an
email pair $(e_i, e_j)$ by $(e_i, e_j).f_k^c = e_i.f_k - e_j.f_k$. For the email
send time feature $t(e)$, we further convert positive and negative comparative
feature values to 1 and -1 respectively. Interaction features are the set of
features derived from the interaction between the sender of the email and the
common recipient $u_r$, as shown in Table 4. The behavior features are the six
$A^M$ and six $R^M$ behavior scores of the email senders. The comparative
interaction and behavior features are defined in the same way as the
comparative email features.
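
The construction of comparative features can be illustrated with a short Python sketch. The feature names, the dictionary layout and the sample values are hypothetical, and mapping a zero difference in send time to 0 is an assumption, since the text only covers positive and negative values.

```python
def comparative_features(feat_i, feat_j, sign_only=("t",)):
    """Return f_k(e_i) - f_k(e_j) for every feature k.

    Features listed in `sign_only` (here the send time t) are reduced to the
    sign of the difference: 1, -1, or 0 for a tie (assumption).
    """
    out = {}
    for name in feat_i:
        diff = feat_i[name] - feat_j[name]
        if name in sign_only:
            diff = (diff > 0) - (diff < 0)
        out[name] = diff
    return out


# Hypothetical feature vectors for the two emails of one pair.
pair = comparative_features(
    {"t": 1700.0, "size": 2048, "A_EC": 0.60},
    {"t": 1250.0, "size": 512, "A_EC": 0.35},
)
```

Swapping the two arguments negates every entry, which is what makes the complement pairs used later in training straightforward to generate.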


2.3 Experiments - Analysis and Comparison of Behavior Models

The first set of experiments evaluates and compares the four types of
behavior models on the Enron dataset. To compare the ranked user lists
produced by two models, we utilize the Kendall $\tau$ distance measure. In each
ranked list, the first and last ranked users represent the most and least
engaging (or responsive) users respectively. Formally, we denote the rank of a
user $u_i$ in a ranked list $L_k$ by $l_k(u_i)$. The Kendall $\tau$ distance
between two ranked lists $L_1$ and $L_2$ over $n$ users is defined as

\[
K(L_1, L_2) = \frac{\left|\{(u_i, u_j) : u_i < u_j,\
  (l_1(u_i) < l_1(u_j) \wedge l_2(u_i) > l_2(u_j)) \vee
  (l_1(u_i) > l_1(u_j) \wedge l_2(u_i) < l_2(u_j))\}\right|}
  {\frac{1}{2} n (n - 1)}
\]

Note that the Kendall $\tau$ distance is 0 if $l_1 = l_2$ for all users and 1 if
one list is the exact reverse of the other; a value near 0.5 indicates no
correlation between $l_1$ and $l_2$ [5, 7].

Table 4: Interaction Features I.

No   Description
19   $|E(Sdr(e) \to u_r)|$
20   $|E(u_r \to Sdr(e))|$
21   $|E(Sdr(e) \leftrightarrow u_r)|$
22   $|RE(Sdr(e) \to u_r)|$
23   $|RE(u_r \to Sdr(e))|$
24   $|RE(Sdr(e) \leftrightarrow u_r)|$
25   $|RE(Sdr(e) \to u_r)| / |E(u_r \to Sdr(e))|$
26   $|RE(u_r \to Sdr(e))| / |E(Sdr(e) \to u_r)|$
27   $|RE(Sdr(e) \leftrightarrow u_r)| / |E(u_r \leftrightarrow Sdr(e))|$
28   $rt(Sdr(e) \to u_r)$
29   $rt(u_r \to Sdr(e))$
30   # threads involving $Sdr(e)$, $u_r$ as senders/recipients
31   # threads involving $Sdr(e)$, $u_r$ as senders

Table 5: Kendall $\tau$ distance ($A^M$, $R^M$).

M =        EC     ER     ET     TC     RG     RW
Distance   0.46   0.52   0.49   0.46   0.5    0.11
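
The distance can be computed with a few lines of Python. The sketch below assumes the count of discordant pairs is normalized by the number of user pairs, $n(n-1)/2$, and that both rankings cover the same users without ties; the function name and the sample rankings are hypothetical.

```python
from itertools import combinations


def kendall_tau_distance(rank1, rank2):
    """Fraction of user pairs that the two rankings order differently.

    rank1, rank2 -- dicts mapping each user to its rank position (1 = top).
    """
    pairs = list(combinations(sorted(rank1), 2))
    if not pairs:
        return 0.0
    # A pair is discordant when the rank differences have opposite signs.
    discordant = sum(
        1 for a, b in pairs
        if (rank1[a] - rank1[b]) * (rank2[a] - rank2[b]) < 0
    )
    return discordant / len(pairs)


engaging = {"alice": 1, "bob": 2, "carol": 3}
responsive = {"alice": 1, "bob": 3, "carol": 2}
d = kendall_tau_distance(engaging, responsive)
```

Identical rankings give 0 and exactly reversed rankings give 1, so the distances near 0.5 reported for most models correspond to rankings that agree on about half of the user pairs.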

Correlation between engagingness and responsiveness. We first
show the correlation between engagingness and responsiveness for each
proposed model. Table 5 shows the Kendall $\tau$ distance between the
engagingness and responsiveness ordered lists of each model. The $\tau$
distance lies between 0.46 and 0.52 for all models except RW (0.11). These
results indicate that engagingness and responsiveness are fairly distinct
behaviors. Most users would receive different ranks for engagingness and
responsiveness.

Correlation between different models. Table 6 and Table 7 show
the correlations of pairs of models by engagingness and responsiveness,
respectively. Table 6 shows that the different engagingness models are quite
similar, especially the email count model (EC) and the thread count model (TC)
(see footnote 3). This is due to most email threads having only two to three
emails each. The similarity across different models is even more prominent for
responsiveness, as shown in Table 7. Again, the EC and TC models show high
correlation in the responsiveness ranking. Overall, our proposed models are
more strongly correlated in responsiveness than in engagingness. The email
based models such as ER and ET are highly correlated in both engagingness and
responsiveness. On the other hand, the random walk (RW) model appears
to rank users more differently from all other models in both engagingness
and responsiveness. This is not a surprise given its rather unique way of
measuring behaviors.

3 The most correlated entry is shown in boldface while entries < 0.05 are underlined.

Table 6: Kendall $\tau$ distance between engagingness models.

           $A^{ER}$   $A^{ET}$   $A^{TC}$   $A^{RG}$   $A^{RW}$
$A^{EC}$   0.14       0.16       0.01       0.18       0.22
$A^{ER}$              0.12       0.14       0.15       0.24
$A^{ET}$                         0.16       0.15       0.22
$A^{TC}$                                    0.18       0.22
$A^{RG}$                                               0.24

Table 7: Kendall $\tau$ distance between responsiveness models.

Most engaging and responsive users. Table 8 shows the top five
engaging users and the top five responsive users after averaging their ranks
across our proposed models. The two sets of top users are
different, consistent with our earlier results. It is interesting to note that
most of the top engaging users are traders. Other than CEO John Lavorato, the
top responsive users are general employees.


Table 8: Top-5 users by engagingness and responsiveness.

        Engagingness                  Responsiveness
Rank    Enron employee    Position    Enron employee      Position
1       Ryan Slinger      Trader      John Lavorato       CEO
2       Larry Campbell    N/A         Monika Causholli    Employee
3       Joe Quenet        Trader      Jeff Dasovich       Employee
4       Mike Swerzbin     Trader      Kate Symes          Employee
5       Jeff King         Manager     Kay Mann            Employee
Table 9: Results of email reply order prediction.

Features used in SVM    Average Accuracy (%)
$SVM_{E+I}$             76.68
$SVM_{All}$             77.31
$SVM_B$                 67.37
$SVM^{-}_{E+I}$         65.33
$SVM^{-}_{All}$         69.78

2.4  Experiments  -  Email  Reply  Order  Prediction  Accuracy 

Prediction performance. The goal of this experiment is to evaluate the
performance of our proposed classification approach to predicting email reply
order. We also want to examine the usefulness of engagingness and
responsiveness behaviors in the prediction task. Five SVM classifiers are
trained, namely: (a) using comparative email and interaction features (denoted
by $SVM_{E+I}$); (b) using comparative behavior features only (denoted by
$SVM_B$); (c) using all features (denoted by $SVM_{All}$); (d) using
comparative email and interaction features except $t(e)$ (denoted by
$SVM^{-}_{E+I}$); and (e) using all features except $t(e)$ (denoted by
$SVM^{-}_{All}$). Classifiers (d) and (e) are included because an earlier study
has shown that email replies often follow the last-in-first-out principle.
$SVM^{-}_{E+I}$ and $SVM^{-}_{All}$ allow us to find out whether we can predict
reply order without knowing the email time information.

From the 27,730 email reply relationships, we extracted a total of 19,167
email pairs for the prediction task. The emails in each pair have replies that
come after both emails are received by the same user. For each email
pair, we computed feature values based only on email data that occurred before
the pair. In addition, we used complement email pairs in training. The
complement of an email pair $(e_i, e_j)$ with class label $c$ is the email pair
$(e_j, e_i)$ with the opposite class label $\bar{c}$. Five-fold cross
validation was used to measure the average accuracy of the classifiers over the
five folds. The accuracy measure is defined by

\[
\text{accuracy} = \frac{\#\ \text{correctly classified pairs}}{\#\ \text{email pairs}}
\]
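
The complement-pair augmentation and the accuracy measure can be sketched in Python as follows. Encoding the two class labels as +1 (the first email of the pair is replied to first) and -1 is an assumption made for this sketch, as are the function names and the toy data.

```python
def with_complements(pairs):
    """Add the complement (e_j, e_i, opposite label) of every training pair.

    pairs -- list of (features_i, features_j, label) with label in {+1, -1}.
    """
    augmented = []
    for feat_i, feat_j, label in pairs:
        augmented.append((feat_i, feat_j, label))
        augmented.append((feat_j, feat_i, -label))
    return augmented


def accuracy(predicted, actual):
    """Number of correctly classified pairs divided by the number of pairs."""
    correct = sum(1 for p, a in zip(predicted, actual) if p == a)
    return correct / len(actual)


train = [({"size": 10}, {"size": 30}, +1), ({"size": 5}, {"size": 2}, -1)]
augmented = with_complements(train)
acc = accuracy([+1, -1, -1, +1], [+1, -1, +1, +1])
```

Training on both orientations of each pair keeps the two classes balanced, which is consistent with the 50% accuracy quoted for random prediction.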

Table 9 shows the results of all five SVM classifiers. $SVM_{All}$
produces the highest accuracy of 77.31% due to the use of all available
features. By excluding the email arrival order feature, the accuracy (of
$SVM^{-}_{All}$) reduces to 69.78%. This performance is reasonably good given
that random prediction gives an accuracy of 50%. The classifier using behavior
features only ($SVM_B$) is about 2 percentage points more accurate than that
using email and interaction features without the email arrival order feature
($SVM^{-}_{E+I}$). The above results show that the email arrival order feature
is an important feature in the prediction task. We however notice that behavior
features contribute to prediction accuracy, especially when the email arrival
order feature is not available.

Table 10: Top-10 features for $SVM^{-}_{All}$.

Rank   Feature                                              Weight
1      $A^{ET}(Sdr(e_i)) - A^{ET}(Sdr(e_j))$                0.66
2      $R^{RG}(Sdr(e_i)) - R^{RG}(Sdr(e_j))$                0.57
3      $Indegree(Sdr(e_i)) - Indegree(Sdr(e_j))$            0.54
4      $A^{RW}(Sdr(e_i)) - A^{RW}(Sdr(e_j))$                0.53
5      # threads involving $u_i$, $u_j$ as senders          0.47
6      $R^{TC}(Sdr(e_i)) - R^{TC}(Sdr(e_j))$                0.46
7      $A^{ER}(Sdr(e_i)) - A^{ER}(Sdr(e_j))$                0.39
8      $|E(Sdr(e_i) \to u_r)| - |E(Sdr(e_j) \to u_r)|$      0.28
9      $size(r(e_i)) - size(r(e_j))$                        0.27
10     $A^{RG}(Sdr(e_i)) - A^{RG}(Sdr(e_j))$                0.24

Top features. Table 10 lists the top 10 features for the $SVM^{-}_{All}$
classifier. The table shows that engagingness based on the email reply time
model (ET) is the most discriminative feature. Six of the top ten features
are behavior features. This suggests that engagingness and responsiveness
are useful in predicting email reply order.

2.5  Discussions 

In this report, we formulate the user engagingness and responsiveness
behaviors in an email network. We have developed six behavior models based on
different principles and evaluated them using the Enron data set. We
also apply the models to the email reply order prediction task and demonstrate
that behavior features can be useful in this task. The work is a significant
step beyond the usual node and network statistics toward determining user
behaviors from their interactions. While our results are promising, there is
still much room for further research. Firstly, behaviors are mutually dependent
and we plan to introduce mutual dependency into our models. Secondly,
behaviors can be localized, as a user may not behave the same way towards
different users. Some users may be more responsive to friends than to
strangers. Localized behavior models should therefore be explored.
3 Behavior Modeling in Mobile Social Networks


Mobile social networks are gaining popularity with the pervasive use of
mobile phones and other handheld devices. In these networks, users maintain
friendship links, exchange short messages and share content with one
another. From the social communication standpoint, messaging in mobile
social networking supplements existing face-to-face and phone communication
as users establish social relationships with one another. Cummings,
Butler and Kraut found that online social relationships are usually weaker
than offline social relationships [3]. In their work, online social
relationships refer to those established through emailing. While we may
generalize the results to relationships established through messaging, there is
a lack of studies that relate messaging behaviors to online relationships
between users, and messaging behaviors to the social status of users in an
online community.

In this part of the research, we study mobile messaging related user
behaviors in myGamma, a well-established mobile social networking site that
supports both friendship links and messaging services. Again, we distinguish
two types of user behaviors: soliciting active responses for an initiated
message (or link) and responding to an incoming message (or link). These
behaviors are known as user engagingness and responsiveness respectively.

Our  thesis  in  this  work  is  that  engagingness  and  responsiveness  behaviors 
are  related  to  the  social  status  of  users  in  a  friendship  network  as  well  as  their 
communication  patterns  with  other  users.  We  specifically  aim  to  answer  the 
following  interesting  research  questions:  (a)  How  can  we  tell  if  a  user  is 
engaging  or  responsive  from  his/her  messaging  activities?  (b)  How  are  a 
user’s  engagingness  and  responsiveness  behaviors  related  to  his/her  status 
in  friendship  networks?  (c)  Are  the  messaging  behaviors  related  to  topics  of 
messages?  If  so,  what  are  the  relationships  like? 

Modeling user behaviors can be challenging owing to the wide variety of
messages and the connectedness among users in the messaging networks.
Messages can be categorized in numerous ways based on their formality,
sentiment, and content. Instead of applying natural language understanding
techniques to the message content, which is usually computationally costly
and inaccurate, we want our messaging behavior models to be defined upon
the messaging header data already available as well as the ways (friendship
links) in which users are linked to one another. As one's behaviors can be
affected by all his/her neighbors, the messaging behavior models should be
able to cope with the inter-dependency between behaviors.

Mobile messaging is in many ways similar to the instant messaging popular
among web users. Both support real-time synchronous communications whenever
users are online. Mobile messaging, however, has the additional feature of
storing incoming messages whenever users are offline so that the messages
can be read when the users come online again. Such a feature enables mobile
messaging to behave like email messaging, which supports mainly
asynchronous communications. As noted in [9], instant messaging users are
likely to communicate with a few acquainted users as opposed to strangers.
Mobile messaging also differs from instant messaging by not restricting the
communicating users to friends on a user's contact list.

These differences distinguish our work from previous works that focus on
instant messaging. To the best of our knowledge, engagingness and
responsiveness are behaviors yet to be studied in mobile social networks,
particularly at large scale. The work presented in this paper is thus an
early effort in this direction. Messaging behaviors of users during online
and offline periods can be different yet related. In this paper, we
demonstrate that a user's online (and offline) durations can be estimated
from the times of messages sent by him/her. From the online durations, we
derive the online and offline messaging sessions between users, which are
in turn used to define the online and offline messaging behaviors.

Our  contributions  can  be  summarized  as  follows: 

•  We propose several quantitative models for measuring user engagingness
and responsiveness in both online and offline messaging sessions.
These include the MsgCount, ReplyTime, SessionInit and Sequence
models. We further extend these models to incorporate mutual
dependency between engagingness and responsiveness.

•  We apply these models to a myGamma dataset containing both messages
and friendship links between users. Using this real dataset, we compare
engagingness with responsiveness and compare the different models with
one another. We further relate the two behaviors to the number of
friendships users have.

•  We finally show that engaging and responsive users play important
roles in messaging topics within an online community. We apply Latent
Dirichlet Allocation [2] to uncover latent topics from our message
dataset. We discover that major topics in the community are driven
by engaging and responsive users.

3.1  Related  Work 

Synchronous vs asynchronous messaging. Messaging is a mode of
communication. Depending on whether the users in communication are
physically present together and whether they are able to receive and respond
to messages in real time, we can classify messaging services as synchronous,
asynchronous, or semi-synchronous. Instant messaging and email messaging
are representatives of synchronous and asynchronous messaging respectively.
Mobile messaging is a mixture of both and is thus semi-synchronous.
There are very few previous efforts on studying user behaviors in email
messaging. In [4], user responsiveness behavior is defined in the context
of replying to emails with the same subject headings. That work does not
cover the engagingness behavior, nor does it explore different
responsiveness behavior models. In instant and mobile messaging, message
structures are much simpler and the subject heading is no longer a viable
grouping criterion. To the best of our knowledge, there is no other research
on modeling messaging behaviors.

Instant messaging behaviors. As instant messaging is very similar to
myGamma's messaging, we examine related work in the area. Nardi,
Whittaker and Bradner found that instant messaging serves largely social
purposes instead of formal information exchanges, even in an organizational
setting. Avrahami and Hudson studied the responsiveness of users in instant
messaging [1]. Responsiveness there refers to the response time required
for a user to respond to an incoming session initiation attempt (SIA)
message. An SIA message is an incoming message from a sender that reaches
a user long after (as determined by some threshold) the user has sent the
previous message to the sender. Strictly speaking, this responsiveness
concept is not a user behavior but a response time label. One of five
response time labels is assigned to each message, corresponding to a reply
within 30 seconds, 1, 2, 5 and 10 minutes respectively, and the proposed
prediction models could achieve 80 to 90% accuracy in assigning response
time labels.

Unlike [1], we focus mainly on mobile messaging related user behaviors.
Due to the peculiar nature of mobile messaging, we have to classify the
online and offline periods of each user. Instead of treating responsiveness
as message response time, we study responsiveness as a quantitative user
characteristic. We also introduce engagingness as another user
characteristic. Our work also involves a much larger dataset.

3.2  Preliminaries 

Mobile messaging users communicate with one another using a mixture of
online and offline messaging sessions. When a user and his/her contact are
online, they can exchange messages with each other in real time. This mode
of messaging is similar to instant messaging, which supports highly
synchronous communication. On the other hand, a mobile messaging user can
also send messages to another user when the latter is offline. Such messages
are stored and are retrieved when the recipient comes online again. Such a
messaging mode is more similar to emailing and text messaging, which are
representatives of asynchronous communication.

With both synchronous and asynchronous communication taking place
in mobile messaging, a mixture of messaging behaviors can exist for the
same users. To study these messaging behaviors separately, we take the
following steps:

•  Step 1 (determine the online and offline durations of users): Each
user may be online or offline when using mobile messaging. For cases
where the online and offline durations of users are not logged, we need
to determine these durations automatically based on the time gaps between
consecutive messages. As every user has his/her own messaging pattern, a
personalized approach to determining online and offline durations
is required. A detailed description of our proposed approach is
given in Section 3.3.

•  Step 2 (identify the online and offline messaging sessions between
users): Once the users' online durations are determined, we proceed to
derive the online and offline messaging sessions between every
communicating pair of users (see Section 3.4). At the end of this step,
each user pair may have zero or more online/offline sessions.

Table 11 defines the notations to be used in the rest of the paper. A
message $m'$ is said to be the reply to a message $m$ if it is the earliest
message that has $Sdr(m') = Rcp(m)$, $Rcp(m') = Sdr(m)$, and $t(m') > t(m)$.
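The reply definition above can be sketched in a few lines of Python. This is only an illustration: the `Message` record, the `find_reply` helper and the toy log are hypothetical names and data introduced here, not part of the myGamma system.

```python
from typing import List, NamedTuple, Optional


class Message(NamedTuple):
    sender: str      # Sdr(m)
    recipient: str   # Rcp(m)
    time: float      # t(m)


def find_reply(m: Message, messages: List[Message]) -> Optional[Message]:
    """Return r(m): the earliest message sent back from Rcp(m) to Sdr(m) after t(m)."""
    candidates = [
        c for c in messages
        if c.sender == m.recipient and c.recipient == m.sender and c.time > m.time
    ]
    return min(candidates, key=lambda c: c.time, default=None)


# Hypothetical toy log: a writes to b twice, b answers once, a answers back.
log = [
    Message("a", "b", 1.0),
    Message("a", "b", 2.0),
    Message("b", "a", 3.0),
    Message("a", "b", 4.0),
]
```

Read literally, the definition lets one message serve as the reply to several earlier messages: in the toy log, the message sent by b at time 3.0 is the reply to both of a's earlier messages.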

3.3  Determination  of  Online  and  Offline  Status 

Determining the online and offline communication for mobile messaging
users is a non-trivial task. In the absence of a log of user online status
over time, we have to resort to a statistical approach to automatically
decide the online and offline periods of each user as he or she uses the
messaging service.

Our main idea for segmenting messages into online and offline messages is
based on a Gaussian Mixture Model. In this model, we envisage that users
send out messages at different rates depending on whether they are online
or offline. We first define a random variable $X$ for the time gap between
two consecutive messages sent by a user, pooled over all users. Assume that
$X$ is formed by two clusters of time gaps, i.e., online and offline. $X$ can
then be modeled by a mixture of two Gaussian distributions
$\mathcal{N}(\mu_1, \sigma_1^2)$ and $\mathcal{N}(\mu_2, \sigma_2^2)$,
where $\mu_1$ and $\mu_2$ represent the mean time gaps of the two
distributions, while $\sigma_1$ and $\sigma_2$ represent the respective
standard deviations. We want to learn the parameters that generate
distributions fitting our dataset.

Table 11: Notations.

Notation          Meaning
$SE(u_i)$         Messages sent by user $u_i$
$RE(u_i)$         Messages received by $u_i$
$RB(u_i)$         Message replies sent by $u_i$
$RT(u_i)$         Messages replying to $u_i$'s earlier messages
$OnP_i$           Online periods of $u_i$
$OffP_i$          Offline periods of $u_i$
$S^{on}_{ij}$     Online sessions between $u_i$ and $u_j$
$S^{off}_{ij}$    Offline sessions between $u_i$ and $u_j$
$r(m)$            Reply to message $m$
$Sdr(m)$          Sender of message $m$
$Rcp(m)$          Recipient of message $m$
$t(m)$            Sent time of message $m$
$M_{i \to j}$     Messages from $u_i$ to $u_j$
$M_{ij}$          Messages between $u_i$ and $u_j$

Suppose we have $N$ observed samples. Let $x_n$ denote the $n$th observed
sample and $\mathcal{N}(x_n; \mu_k, \sigma_k^2)$ denote the probability
density of $x_n$ under cluster $k$. Let $\pi_k \in [0, 1]$ be the relative
size (mixing weight) of cluster $k$. We use the EM algorithm to solve for the
values of $\pi_k$, $\mu_k$ and $\sigma_k$ as follows:


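$$ f(n,k) = \frac{\pi_k \, \mathcal{N}(x_n; \mu_k, \sigma_k^2)}{\sum_{j=1}^{2} \pi_j \, \mathcal{N}(x_n; \mu_j, \sigma_j^2)} \tag{14} $$

$$ \pi_k = \frac{\sum_{n=1}^{N} f(n,k)}{N} \tag{15} $$

$$ \mu_k = \frac{\sum_{n=1}^{N} x_n \, f(n,k)}{\sum_{n=1}^{N} f(n,k)} \tag{16} $$

$$ \sigma_k^2 = \frac{\sum_{n=1}^{N} f(n,k) \, (x_n - \mu_k)^2}{\sum_{n=1}^{N} f(n,k)} \tag{17} $$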

Once the parameters are learnt, the Gaussian distribution with the smaller
$\mu_k$ models the time gaps between sent messages when users are in online
periods, while the other Gaussian distribution models the time gaps when
users are in offline periods. We also derive a time gap threshold $\gamma$
to easily classify time gaps into online and offline periods.
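The EM updates for the two-cluster mixture can be sketched as follows. This is a minimal illustration rather than the implementation used in the report: the initialisation, the fixed iteration count, the numerical guards and the decision rule in `is_online_gap` are all assumptions made here, and the time gaps are synthetic.

```python
import math
import random


def normal_pdf(x, mu, var):
    """Density of a Gaussian with mean mu and variance var at x."""
    return math.exp(-((x - mu) ** 2) / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def fit_two_gaussians(gaps, iterations=100):
    """EM for a two-component one-dimensional Gaussian mixture."""
    xs = sorted(gaps)
    n = len(xs)
    half = n // 2
    # Assumed initialisation: split the sorted gaps into a lower and an upper half.
    mu = [sum(xs[:half]) / half, sum(xs[half:]) / (n - half)]
    mean_all = sum(xs) / n
    var_all = sum((x - mean_all) ** 2 for x in xs) / n
    var = [var_all, var_all]
    pi = [0.5, 0.5]
    for _ in range(iterations):
        # E-step: responsibility f(n, k) of cluster k for sample x_n.
        resp = []
        for x in xs:
            w = [pi[k] * normal_pdf(x, mu[k], var[k]) for k in range(2)]
            total = sum(w) or 1e-300  # guard against numerical underflow
            resp.append([wk / total for wk in w])
        # M-step: update mixing weight, mean and variance of each cluster.
        for k in range(2):
            nk = max(sum(r[k] for r in resp), 1e-12)
            pi[k] = nk / n
            mu[k] = sum(r[k] * x for r, x in zip(resp, xs)) / nk
            var[k] = max(sum(r[k] * (x - mu[k]) ** 2 for r, x in zip(resp, xs)) / nk, 1e-9)
    return pi, mu, var


def is_online_gap(x, pi, mu, var):
    """Assumed decision rule: choose the cluster with the larger weighted density."""
    online = 0 if mu[0] <= mu[1] else 1
    offline = 1 - online
    return (pi[online] * normal_pdf(x, mu[online], var[online])
            >= pi[offline] * normal_pdf(x, mu[offline], var[offline]))


# Synthetic time gaps in hours (not the myGamma data).
rng = random.Random(0)
gaps = [rng.gauss(1.0, 0.5) for _ in range(100)] + [rng.gauss(10.0, 2.0) for _ in range(100)]
pi, mu, var = fit_two_gaussians(gaps)
```

A single time gap threshold could then be taken as the point where the two weighted densities cross between the two means; the text does not state how the threshold is actually derived, so this is only one plausible choice.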

3.4  Online and Offline Sessions

A message session $s$ between two users $u_i$ and $u_j$ is defined by a set
of consecutive messages between them. Due to the different online and
offline messaging behaviors, we further divide sessions into online and
offline sessions.

Suppose we are given a set of messages $M_{ij}$ between $u_i$ and $u_j$, and
the online periods of $u_i$ and $u_j$, denoted by
$OnP_i = \{[ts_{i1}, te_{i1}], \ldots, [ts_{ik_i}, te_{ik_i}]\}$ and
$OnP_j = \{[ts_{j1}, te_{j1}], \ldots, [ts_{jk_j}, te_{jk_j}]\}$ respectively.

The set of overlapping online periods between $u_i$ and $u_j$, $OlpP_{ij}$,
is defined by:

$$ OlpP_{ij} = OnP_i \cap OnP_j = \{ [\max(ts_i, ts_j), \min(te_i, te_j)] \mid [ts_i, te_i] \in OnP_i,\ [ts_j, te_j] \in OnP_j,\ (ts_i \le te_j) \wedge (ts_j \le te_i) \} $$

The set of online sessions between $u_i$ and $u_j$, $S^{on}_{ij}$, is then
defined as a collection of message sets induced by the overlapping online
periods such that each message set contains at least one exchange of
messages between $u_i$ and $u_j$:

$$ S^{on}_{ij} = \{ M_{ij}(p) \mid p \in OlpP_{ij} \wedge (\exists m, m' \in M_{ij}(p),\ m' = r(m)) \} $$

where $M_{ij}(p) = \{ m \in M_{ij} \mid t(m) \in p \}$.

The set of online session intervals between $u_i$ and $u_j$, $OnSsnP_{ij}$,
is thus the set of overlapping online periods that cover online sessions,
i.e.:

$$ OnSsnP_{ij} = \{ p \in OlpP_{ij} \mid \exists m, m' \in M_{ij}(p),\ m' = r(m) \} $$

From the online session intervals, we derive the remaining periods as:

$$ RemP_{ij} = [\min(ts_i^*, ts_j^*), \max(te_i^*, te_j^*)] - OnSsnP_{ij} $$

where $ts_i^*$ ($ts_j^*$) and $te_i^*$ ($te_j^*$) denote the minimum $ts_i$
($ts_j$) and maximum $te_i$ ($te_j$), respectively, in $OnP_i$ ($OnP_j$).

The set of offline sessions $S^{off}_{ij}$ is then defined as a collection
of message sets induced by the remaining periods such that each message set
contains at least one exchange of messages between $u_i$ and $u_j$:

$$ S^{off}_{ij} = \{ M_{ij}(p) \mid p \in RemP_{ij} \wedge (\exists m, m' \in M_{ij}(p),\ m' = r(m)) \} $$

The set of offline session intervals between $u_i$ and $u_j$,
$OffSsnP_{ij}$, is thus the set of remaining periods that cover offline
sessions, i.e.:

$$ OffSsnP_{ij} = \{ p \in RemP_{ij} \mid \exists m, m' \in M_{ij}(p),\ m' = r(m) \} $$

Figure 5: Online/Offline Periods and Sessions

The  start  and  end  times  of  a  session  s  refer  to  the  times  of  the  first  and 
last  messages  respectively.  The  user  who  sends  the  first  message  of  s  is  also 
known  as  the  initiator  of  the  session. 

Consider the example shown in Figure 5. Users $u_i$ and $u_j$ have two
online periods each. The messages directed between them are the ones
exchanged between $u_i$ and $u_j$. The messages directed away from them are
sent to other users. Although $u_i$ and $u_j$ are both online in the left
overlapping period, it does not constitute an online session due to a lack
of message exchange between them. The only online session between $u_i$ and
$u_j$ is thus $\{m_9, m_{10}, m_{11}, m_{12}\}$. Of the two remaining
periods, only the left one has message exchanges between $u_i$ and $u_j$.
Hence, the offline session found is $\{m_3, m_6, m_7, m_8\}$.
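The session construction of this section can be sketched as below. The function names, the tuple layout `(time, sender, recipient)` and the toy periods are assumptions for illustration; they do not reproduce the data of Figure 5. Intervals are treated as closed, so a message falling exactly on a boundary would need a tie-breaking rule that this sketch does not provide.

```python
def overlapping_periods(on_i, on_j):
    """Pairwise intersections of two lists of closed intervals (ts, te)."""
    out = []
    for ts_i, te_i in on_i:
        for ts_j, te_j in on_j:
            if ts_i <= te_j and ts_j <= te_i:
                out.append((max(ts_i, ts_j), min(te_i, te_j)))
    return sorted(out)


def has_exchange(msgs):
    """True if some message is followed by a message in the opposite direction."""
    return any(
        b[0] > a[0] and b[1] == a[2] and b[2] == a[1]
        for a in msgs for b in msgs
    )


def sessions_in(periods, msgs):
    """Return (period, messages) pairs for periods containing an exchange."""
    found = []
    for ts, te in periods:
        inside = [m for m in msgs if ts <= m[0] <= te]
        if has_exchange(inside):
            found.append(((ts, te), inside))
    return found


def remaining_periods(on_i, on_j, session_periods):
    """Whole observed span minus the online session intervals."""
    start = min(ts for ts, _ in on_i + on_j)
    end = max(te for _, te in on_i + on_j)
    rem, cursor = [], start
    for ts, te in sorted(session_periods):
        if ts > cursor:
            rem.append((cursor, ts))
        cursor = max(cursor, te)
    if cursor < end:
        rem.append((cursor, end))
    return rem


# Hypothetical online periods and messages (time, sender, recipient).
on_i = [(0, 10), (20, 30)]
on_j = [(5, 12), (22, 28)]
msgs = [(6, "i", "j"), (13, "j", "i"), (15, "i", "j"), (23, "i", "j"), (24, "j", "i")]

olp = overlapping_periods(on_i, on_j)
online = sessions_in(olp, msgs)
rem = remaining_periods(on_i, on_j, [p for p, _ in online])
offline = sessions_in(rem, msgs)
```

In this toy data the first overlapping period contains a single message and therefore no exchange, so it yields no online session and its message falls into a remaining period, which mirrors the situation described for the left overlapping period in the example above.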


3.5  Mobile  Social  Network  Dataset 

In the myGamma mobile social networking site, members interact and form
online communities. Most members are young adults between the ages of 20
and 30. The myGamma dataset we obtained consists of 194,809 users and
2.7M messages among them within the one-month period from September
8, 2009 to September 10, 2009. We first selected the users with at least
one friendship link, as not all users specify their friendships. In addition
to the friendship network, we have message links between users forming the
message network. A message link from user $u_i$ to user $u_j$ is defined
when there is at least one message from $u_i$ to $u_j$. We further selected
the users who have sent at least 4 messages and received at least 4
messages. This way, we obtained a final dataset of 14,423 users with
1,196,011 messages among them. Within this set of messages, 236,798 are
replies to some messages in the set. Table 12 summarizes the statistics of
this final dataset.

Table 12: Dataset Statistics.

Statistic                                 Value
Users                                     14,423
Messages                                  1,441,272
Sessions                                  72,297
Online sessions                           5,491
Offline sessions                          66,806
Users participating sessions              10,346
Users participating online sessions       4,441
Users participating offline sessions      10,096
Users initiating sessions                 9,408
Users initiating online sessions          3,035
Users initiating offline sessions         9,186
Messages in sessions                      199,073
Messages in online sessions               12,318
Messages in offline sessions              186,755
Friendship links                          1,795,674
Foe links                                 109,510
Message links                             1,196,011

We apply the Gaussian Mixture Model to the dataset to determine the online
and offline periods of users. To avoid biasing the time gap threshold with
users who send very few (one or two) messages, we sample the time gaps from
users who have at least 100 messages each. There are 3,520 such users. The
time gap threshold $\gamma$ obtained is around 4 hours (see Figure 6). The
threshold is subsequently applied to the final dataset to obtain online and
offline sessions, with the numbers shown in Table 12.

3.6  User Engagingness and Responsiveness for Mobile Messaging

In this section, we introduce four pairs of basic engagingness and
responsiveness behavior models, namely MsgCount, ReplyTime, SessionInit,
and Sequence. They are designed based on message, reply time, session
and messaging sequence data respectively. Each model assigns an
engagingness (responsiveness) score in $[0, 1]$ to each user, with 0 for a
non-engaging (non-responsive) user and 1 for a fully engaging (fully
responsive) user. As users may demonstrate different messaging behaviors
during online and offline sessions, every model except SessionInit has both
online and offline versions. For example, the online and offline session
versions of MsgCount are MsgCount$_{on}$ and MsgCount$_{off}$ respectively.
For the SessionInit model, only the online version is applicable as it
involves online sessions only.

Figure 6: Two-Gaussian Mixture Model for Determining Online and Offline
Periods in the myGamma Dataset

MsgCount Model: This model is designed based on the principle that
an engaging user should have most of his/her messages replied to by other
users, while a responsive user should reply to most of the messages he/she
receives. The engagingness and responsiveness scores, $A_x^{MsgCount}$ and
$R_x^{MsgCount}$, for online and offline sessions are thus defined by:

$$ A_x^{MsgCount}(u_i) = \frac{|RT_x(u_i)|}{|SE_x(u_i)|} \tag{18} $$

$$ R_x^{MsgCount}(u_i) = \frac{|RB_x(u_i)|}{|RE_x(u_i)|} \tag{19} $$

where session type $x$ can be online or offline, denoted by $on$ and $off$
respectively.
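A minimal sketch of the MsgCount scores follows, assuming messages of one session type are given as `(sender, recipient, time)` tuples and that replies are identified with the earliest-reply rule of Section 3.2. The function names are hypothetical, and returning 0.0 for a user with no sent (or received) messages is a convention chosen here, since the ratio is otherwise undefined.

```python
from collections import defaultdict


def reply_index(msgs):
    """Map the index of each message to the index of its reply r(m), if any."""
    replies = {}
    for a, (s, r, t) in enumerate(msgs):
        later = [(t2, b) for b, (s2, r2, t2) in enumerate(msgs)
                 if s2 == r and r2 == s and t2 > t]
        if later:
            replies[a] = min(later)[1]
    return replies


def msgcount_scores(msgs):
    """Return (engagingness, responsiveness) dictionaries keyed by user."""
    replies = reply_index(msgs)
    sent, received = defaultdict(list), defaultdict(list)
    for a, (s, r, _) in enumerate(msgs):
        sent[s].append(a)
        received[r].append(a)
    engaging, responsive = {}, {}
    for u in set(sent) | set(received):
        # RT(u): messages replying to u's earlier messages.
        rt = {replies[a] for a in sent.get(u, []) if a in replies}
        # RB(u): replies sent by u to messages u received.
        rb = {replies[a] for a in received.get(u, []) if a in replies}
        engaging[u] = len(rt) / len(sent[u]) if u in sent else 0.0
        responsive[u] = len(rb) / len(received[u]) if u in received else 0.0
    return engaging, responsive


# Hypothetical toy log.
msgs = [("a", "b", 1), ("b", "a", 2), ("a", "b", 3), ("c", "a", 4)]
A, R = msgcount_scores(msgs)
```

In the toy log, user a sends two messages and only the first is answered, so a's engagingness is 0.5; b answers one of the two messages received from a, so b's responsiveness is also 0.5.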

ReplyTime Model: Unlike MsgCount, this model examines the reply
times of messages to determine user engagingness and responsiveness. An
engaging user should have his/her messages quickly replied to by others,
while a responsive user should quickly reply to received messages. Given a
message $m'$ which is a reply to message $m$, i.e., $m' = r(m)$, the reply
time of $m'$ is $rt(m') = t(m') - t(m)$. The z-normalized reply time
$\hat{rt}(m')$ is defined by $\frac{rt(m') - \overline{rt}}{\sigma_{rt}}$,
where $\overline{rt}$ and $\sigma_{rt}$ are the mean and standard deviation
of reply time respectively. Now, we define the engagingness and
responsiveness of the ReplyTime model as:

$$ A_x^{ReplyTime}(u_i) = \frac{1}{|SE_x(u_i)|} \sum_{\substack{m \in SE_x(u_i) \\ m' = r(m)}} f(\hat{rt}(m')) \tag{20} $$

$$ R_x^{ReplyTime}(u_i) = \frac{1}{|RE_x(u_i)|} \sum_{\substack{m \in RE_x(u_i) \\ r(m) = m'}} f(\hat{rt}(m')) \tag{21} $$

where

$$ f(x) = \frac{e^{-x}}{1 + e^{-x}} \tag{22} $$

The function $f()$ is designed to convert the normalized reply time to the
range $[0, 1]$, with 0 and 1 representing extremely slow and extremely fast
reply times respectively.
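The ReplyTime scores can be sketched as follows. The input layout (one `(sender, recipient, reply_time)` tuple per message, with `None` for unanswered messages) and the function names are assumptions for illustration. The squashing function is taken here to be the decreasing logistic function, an assumption chosen so that slow replies map to values near 0 as the text describes; the population standard deviation is used for the z-normalization, and at least one answered message is required.

```python
import math
from collections import defaultdict


def f(x):
    """Map a z-normalized reply time to [0, 1]; slow replies score near 0."""
    # 1 / (1 + e^x) is algebraically the same as e^{-x} / (1 + e^{-x}).
    return 1.0 / (1.0 + math.exp(min(x, 700.0)))  # cap avoids float overflow


def replytime_scores(msgs):
    """msgs: (sender, recipient, reply_time or None) for one session type."""
    times = [rt for _, _, rt in msgs if rt is not None]
    mean = sum(times) / len(times)
    std = math.sqrt(sum((rt - mean) ** 2 for rt in times) / len(times))
    sent, received = defaultdict(list), defaultdict(list)
    for s, r, rt in msgs:
        # Unanswered messages add 0 to the sum but still count in |SE| and |RE|.
        if rt is None:
            score = 0.0
        else:
            score = f((rt - mean) / std if std > 0 else 0.0)
        sent[s].append(score)
        received[r].append(score)
    engaging = {u: sum(v) / len(v) for u, v in sent.items()}
    responsive = {u: sum(v) / len(v) for u, v in received.items()}
    return engaging, responsive


# Hypothetical toy data: reply times in minutes.
msgs = [("a", "b", 10.0), ("a", "b", 30.0), ("b", "a", None)]
A, R = replytime_scores(msgs)
```

Because the two answered messages have z-scores of -1 and +1, their contributions are symmetric around 0.5, which gives user a an engagingness of 0.5 in this toy example.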

SessionInit Model: In this model, we adopt the principle that an
engaging user is more likely to initiate online messaging sessions with the
messages he/she sends out, while a responsive user is more likely to
participate in online sessions initiated by messages from others. We first
denote the sets of online session initiating and participating messages of a
user $u_i$ by $SsnInitMsg(u_i)$ and $SsnMsg(u_i)$ respectively. The former
are the first messages of online sessions. The SessionInit models for
engagingness and responsiveness are then defined as:

$$ A_{on}^{SessionInit}(u_i) = \frac{|SsnInitMsg(u_i)|}{|SsnInitMsg(u_i)| + |SE_x(u_i) - SsnMsg(u_i)|} \tag{23} $$

$$ R_{on}^{SessionInit}(u_i) = \frac{\sum_j |SsnInitMsg(u_j) \cap M_{j \to i}|}{\sum_j \left( |SsnInitMsg(u_j) \cap M_{j \to i}| + |M_{j \to i} - SsnMsg(u_j)| \right)} \tag{24} $$

where $SsnInitMsg(u_j) \cap M_{j \to i}$ represents the set of messages from
$u_j$ to $u_i$ that successfully initiate online sessions with $u_i$, and
$M_{j \to i} - SsnMsg(u_j)$ represents the set of messages from $u_j$ to
$u_i$ that fail to initiate online sessions with $u_i$.
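A sketch of the SessionInit scores is given below, under the assumption that each message already carries two flags: whether it is the first message of an online session, and whether it belongs to any online session. The tuple layout and function name are hypothetical, and a score of 0.0 is returned when a user has neither initiating nor failed messages.

```python
from collections import defaultdict


def sessioninit_scores(msgs):
    """msgs: (sender, recipient, initiates_session, in_session) per message."""
    init_sent, fail_sent = defaultdict(int), defaultdict(int)
    init_recv, fail_recv = defaultdict(int), defaultdict(int)
    users = set()
    for s, r, initiates, in_session in msgs:
        users.update((s, r))
        if initiates:
            # Message that successfully starts an online session.
            init_sent[s] += 1
            init_recv[r] += 1
        elif not in_session:
            # Message outside every online session: a failed initiation.
            fail_sent[s] += 1
            fail_recv[r] += 1

    def ratio(ok, failed):
        return ok / (ok + failed) if ok + failed else 0.0

    engaging = {u: ratio(init_sent[u], fail_sent[u]) for u in users}
    responsive = {u: ratio(init_recv[u], fail_recv[u]) for u in users}
    return engaging, responsive


# Hypothetical toy data: a sends four messages to b, one of which starts a session.
msgs = [
    ("a", "b", True, True),
    ("a", "b", False, True),
    ("a", "b", False, False),
    ("a", "b", False, False),
    ("b", "a", False, True),
]
A, R = sessioninit_scores(msgs)
```

Messages that belong to a session without initiating it are ignored by both scores, which follows the set difference with the session messages in the denominators.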

Sequence Model. Message sequence refers to the sequence of messages
sent and received by a user ordered by time. To derive engagingness and
responsiveness from message sequences, we consider the principle that an
engaging user is expected to have his or her sent messages replied to soon
after they are received by the message recipient, and a responsive user
replies soon after receiving messages. As the time taken to reply to a
message may vary, we consider the number of messages that are received
later than a message $m$ but replied to before $m$ by a user as a proxy of
how soon $m$ is replied to.

The above principle is used to develop the Sequence Model. Let
$seq_{x,i}$ denote the online ($x = on$) or offline ($x = off$) session
message sequence of user $u_i$. When a message received by $u_i$ is replied
to before other message(s) received earlier, the reply to the former is
known as an out-of-order reply. Formally, for a message $m$ received by
$u_i$, we define the number of messages received and the number of
out-of-order replies between $m$ and its reply $m'$ in $seq_{x,i}$, denoted
by $n_{x,r}(u_i, m)$ and $n_{x,o}(u_i, m)$ respectively, as

$$ n_{x,r}(u_i, m) = \begin{cases} \text{\# messages received between } m \text{ and } m' \text{ in } seq_{x,i}, & \text{if } \exists m' \in RT_x(u_i),\ r(m) = m' \\ -1, & \text{otherwise} \end{cases} \tag{25} $$

$$ n_{x,o}(u_i, m) = \begin{cases} \text{\# messages received between } m \text{ and } m' \text{ in } seq_{x,i} \text{ that have been replied to}, & \text{if } \exists m' \in RT_x(u_i),\ r(m) = m' \\ -1, & \text{otherwise} \end{cases} \tag{26} $$

The $-1$ value is assigned to $n_{x,r}$ and $n_{x,o}$ when $m$ is not
replied to at all, so that the corresponding term in the sums below is 0.
The user engagingness and responsiveness of the Sequence$_x$ model are thus
defined as:

$$ A_x^{Sequence}(u_i) = \frac{\sum_{m \in SE_x(u_i),\, u_j = Rcp(m)} \left( 1 - \frac{n_{x,o}(u_j, m)}{n_{x,r}(u_j, m)} \right)}{|SE_x(u_i)|} \tag{27} $$

$$ R_x^{Sequence}(u_i) = \frac{\sum_{m \in RE_x(u_i)} \left( 1 - \frac{n_{x,o}(u_i, m)}{n_{x,r}(u_i, m)} \right)}{|RE_x(u_i)|} \tag{28} $$

3.7  Mutual  Dependency  Based  Models 

In the above basic models, user engagingness and responsiveness are
computed independently. They share the same underlying assumption that the
messaging behaviors of a user are independent of other users. This
assumption does not always hold in practice, as user behaviors are likely to
be affected by the other users he or she communicates with. Hence, we have
designed the mutual dependency based engagingness and responsiveness models.

Table 13: Correlation of engagingness models in online sessions.

          A^RT   A^SI   A^SQ   A^MC*  A^RT*  A^SI*  A^SQ*
A^MC      0.86   0.98   0.99   0.86   0.86   0.73   0.86
A^RT             0.85   0.86   0.99   0.99   0.79   0.99
A^SI                    0.98   0.85   0.85   0.75   0.85
A^SQ                           0.86   0.86   0.74   0.86
A^MC*                                 0.99   0.79   0.99
A^RT*                                        0.79   0.99
A^SI*                                               0.79

Table 14: Correlation of responsiveness models in online sessions.

          R^RT   R^SI   R^SQ   R^MC*  R^RT*  R^SI*  R^SQ*
R^MC      0.85   0.98   0.99   0.86   0.85   0.97   0.86
R^RT             0.81   0.86   0.99   0.99   0.88   0.99
R^SI                    0.98   0.81   0.81   0.99   0.81
R^SQ                           0.86   0.86   0.97   0.86
R^MC*                                 0.99   0.88   0.99
R^RT*                                        0.88   0.99
R^SI*                                               0.88

Suppose $A^M(u_i)$ and $R^M(u_i)$ are the engagingness and responsiveness of
user $u_i$ computed using model $M$. The mutual dependency between $A^M$ and
$R^M$ can be expressed as:

•  A user is considered more engaging if he/she can get less responsive
users to respond. Formally, we write:

$$ A^{M*}(u_i) = \frac{\sum_{u_j} q^{A}_{ij} \cdot (1 - R^{M}(u_j))}{|SE_x(u_i)|} \tag{29} $$

•  A user is considered more responsive if he/she responds to less
engaging users.

$$ R^{M*}(u_i) = \frac{\sum_{u_j} q^{R}_{ij} \cdot (1 - A^{M}(u_j))}{|RE_x(u_i)|} \tag{30} $$

where $q^{A}_{ij}$ and $q^{R}_{ij}$ denote the quantity values between $u_i$
and $u_j$ computed based on the principle of $M$ (e.g., the number of replies
between $u_i$ and $u_j$ in $A^{MC}(u_i)$).

Table 15: Correlation of engagingness and responsiveness models in online
sessions.

Model   Spearman's rho      Model   Spearman's rho
MC      0.83                MC*     0.75
RT      0.75                RT*     0.75
SI      0.78                SI*     0.72
SQ      0.83                SQ*     0.75

4  Experiment  Results  -  Comparison  of  Messaging 
Behaviors 

To compare user behavior models, we examine Spearman's rank correlation
coefficient. The Spearman's rho of two ranked lists $l_1$ and $l_2$,
$\rho(l_1, l_2)$, is defined by:

$$ \rho(l_1, l_2) = 1 - \frac{6 \sum_{u_i} d_{u_i}^2}{n(n^2 - 1)} \tag{31} $$

where $l_1$ and $l_2$ each hold the ranks of the same $n$ users and
$d_{u_i} = l_1(u_i) - l_2(u_i)$ is the difference between the ranks of user
$u_i$ on $l_1$ and $l_2$. The value of $\rho$ falls between $-1$ and $1$,
representing perfect negative and perfect positive correlation respectively.
In addition, $\rho = 0$ stands for no rank correlation.
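Spearman's rho as defined above can be computed directly from two rank assignments. The sketch below uses hypothetical helper names; `ranks_from_scores` breaks ties by sort order, which is a simplification, since the closed-form formula assumes that ranks are distinct.

```python
def ranks_from_scores(scores):
    """Rank users by descending score: rank 1 for the highest score."""
    ordered = sorted(scores, key=lambda u: scores[u], reverse=True)
    return {u: position + 1 for position, u in enumerate(ordered)}


def spearman_rho(rank1, rank2):
    """Spearman's rho of two rank dictionaries over the same users."""
    n = len(rank1)
    d2 = sum((rank1[u] - rank2[u]) ** 2 for u in rank1)
    return 1.0 - 6.0 * d2 / (n * (n * n - 1))


# Hypothetical scores of four users under two behavior models.
model_1 = {"u1": 0.9, "u2": 0.7, "u3": 0.4, "u4": 0.1}
model_2 = {"u1": 0.2, "u2": 0.3, "u3": 0.6, "u4": 0.8}
r1 = ranks_from_scores(model_1)
r2 = ranks_from_scores(model_2)
```

Here the second model ranks the users in exactly the reverse order of the first, so the squared rank differences sum to 20 and the formula gives 1 - (6 x 20) / (4 x 15) = -1.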

Comparison between user engagingness (responsiveness) models.
Table 13 (Table 14) shows the Spearman's rho between the ranked
lists produced by different engagingness (responsiveness) models for online
sessions. The tables show that most engagingness (responsiveness) models
are very similar to one another, except $A^{SI}$ and $A^{SI*}$, which differ
slightly more. This is because the principle of the SessionInit model is
distinct from those of the other models. In the SessionInit model, the
engagingness of a user is high when the user tends to initiate many
sessions. However, it turns out that most users initiate only a small number
of sessions in the myGamma dataset. Though not shown here, we observe the
same for engagingness (responsiveness) in offline sessions.

Comparison between engagingness and responsiveness. Next,
we examine the difference between engagingness and responsiveness under
each model for online sessions. As shown in Table 15, the Spearman's
rho values between the two behaviors of the same model are mostly lower
than those observed between two models of the same behavior (say,
engagingness); that is, the two behaviors differ more from each other than
two models of one behavior do. The only exception is the SessionInit model,
possibly because of the relatively sparser data available for measuring it.
Interestingly, for offline sessions, we observe that the distinction between
engagingness and responsiveness is less obvious. This could be due to the
offline nature (i.e., long time lag) of responding to messages between
users.

Figure 7: Engagingness/responsiveness and friendship links.

Engagingness/responsiveness and friendship links. Figure 7 depicts
boxplots of the number of bi-directed friendship links of users, divided
into five engagingness/responsiveness intervals of size 0.2. Here, we
derive the overall engagingness (responsiveness) of each user by averaging
the engagingness (responsiveness) scores of the different models (including
online and offline versions). We observe that users with higher engagingness
have more friendship links. This is less obvious for responsiveness. This
suggests that engaging users are more capable of attracting and establishing
friendships.

5  Experiment  Results  -  Topic  Specific  Messaging 
Behavior  Analysis 

5.1  Motivation 

Users demonstrate different messaging behaviors in different topics of
discussion. For interesting topics, one expects users to be more engaging
and responsive, while uninteresting topics will only turn users away from
participation. In this section, we analyze user engagingness and
responsiveness for different message topics in our dataset. The purpose
here is to identify interesting topics within the online community.

To conduct this study, we first identify the major message topics from
the aggregated message content for a set of users using Latent Dirichlet
Allocation (LDA) [2]. We then analyze the distribution of engagingness and
responsiveness of users within each message topic.

Table 16: Major Topics.

Topic   Top 10 terms
T14     love, chat, hello, want, dear, baby, friend, dont, hope, miss
T15     dear, chat, sana, sawa, doin, kwani, swty, pliz, thea, sasa
T17     view, blkapp, mode, click, gift, return, gifts, love, private, thank

5.2  Message  Topic  Distillation 

For our analysis, we select only users who indicate English as their
preferred language; there are 27,920 such users. Despite this pruning,
some users still write non-English messages, as our results show. Because
each message has limited content, we aggregate the messages by their
senders and recipients. Messages sent by a user capture the topics about
which he/she is interested in communicating with others. On the other
hand, messages received by a user represent the topics about which others
wish to communicate with him/her. We call these two aggregated message
contents the out-document and in-document of the user. We also remove stop
words from this content using a combined dictionary of 400+ stop words
from [6]. Given a set of documents and k topics, LDA essentially finds the
k latent topics in the documents such that each document is assigned a
topic distribution, and each word occurrence in the document is assigned a
topic. Since topics are not given beforehand, we performed LDA on the
merged set of out-documents and in-documents with k = 20 common topics.
The empirical choice of k = 20 appears to work well, as we could find the
popular topics that exist in the data.
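
The aggregation step described above can be sketched as follows. This is only an illustration under stated assumptions: the message tuple layout, the whitespace tokenizer, the function name and the tiny stop-word set are hypothetical (the study uses the 400+ word dictionary from [6]), and the LDA fitting itself is left to whichever implementation is at hand.

```python
from collections import defaultdict

# Hypothetical stand-in for the 400+ stop-word dictionary from [6].
STOP_WORDS = {"the", "a", "to", "you", "i", "is", "and"}


def build_documents(messages, stop_words=STOP_WORDS):
    """Aggregate messages into per-user out-documents and in-documents.

    messages: iterable of (sender, recipient, text) tuples (assumed layout).
    Returns two dicts mapping a user id to a list of tokens.
    """
    out_docs = defaultdict(list)  # tokens of messages sent by the user
    in_docs = defaultdict(list)   # tokens of messages received by the user
    for sender, recipient, text in messages:
        tokens = [t for t in text.lower().split() if t not in stop_words]
        out_docs[sender].extend(tokens)
        in_docs[recipient].extend(tokens)
    return dict(out_docs), dict(in_docs)


# Hypothetical toy messages.
messages = [
    ("u1", "u2", "hello dear I miss you"),
    ("u2", "u1", "hope to chat"),
    ("u1", "u3", "want to chat"),
]
out_docs, in_docs = build_documents(messages)

# The merged set of documents is what would be passed to LDA with k = 20.
merged = list(out_docs.values()) + list(in_docs.values())
```

Keeping the two dictionaries separate after merging makes it possible to look up a user's out-document and in-document topic distributions independently once the topics have been fitted.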

The  topic  distillation  results  are  shown  in  Table  16.  A  uniform  topic 
distribution  assumption  for  users  would  have  0.1  assigned  for  each  topic. 
Among the 20 topics, most have only a few hundred users (e.g., topic 1
has 141 users), while topics 14, 15, and 17 have 27,741, 17,088, and 4,780
users respectively. We call these users the main users. We empirically
select topics 14, 15 and 17 as the major topics, as they have many more
main users. The remaining topics are thus the non-major topics.

To conserve space, we only show the top 10 terms found in the three
major topics. Topic 14, the largest topic in terms of main user count,
consists mainly of greeting terms. This is not a surprise, as users tend to
greet one another in such a social network. Topic 15 appears to be
dominated by abbreviated terms (e.g., "doin" = "doing", "swty" = "sweety")
and non-English terms (e.g., "sana", "sawa", "kwani"). Topic 17 is likely
to be related to the use of software and the exchange of gifts.
Figure 8: Average Topic Probability Distribution. (a) Major Topics; (b) Non-Major Topics.


5.3  Messaging  Behaviors  in  Message  Topics 

We would now like to examine the distinction between engaging (or
responsive) users and other users in both major and non-major topics.

Figure 8a shows the boxplots of the top 10% engaging (responsive) users'
average major topic probabilities and those of the non-top engaging
(responsive) users. The average major topic probability of a user is
derived by averaging the topic probabilities of his/her out-documents
(in-documents) for the major topics (i.e., Topics 14, 15 and 17).
Similarly, we derive the average non-major topic probability of each user
for Figure 8b. Figure 8a shows that the top 10% engaging users contribute
more to the major topics than the other users. On the other hand, the
former contribute less on average to the non-major topics than the other
users, as shown in Figure 8b. From the figures, we also observe that the
major topics enjoy more user contribution than the non-major topics in
general. We also examine the average topic probability of the top 10%
responsive users and the non-top 10% responsive users for major and
non-major topics in Figure 8, which shows results similar to those for
engaging users. On the whole, the results match our intuition that
engaging and responsive users are the ones driving important topics in the
online community. That is, the former tend to generate messages on major
topics while the latter tend to receive messages on major topics.
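
The quantities compared in Figure 8 can be sketched as follows. This is a minimal illustration, not the code used in the study: it assumes each document's topic distribution is a list of 20 probabilities indexed from topic 1, and the function names and toy values are hypothetical.

```python
K = 20                       # number of LDA topics used in the report
MAJOR_TOPICS = (14, 15, 17)  # 1-based topic ids from Table 16
NON_MAJOR_TOPICS = tuple(t for t in range(1, K + 1) if t not in MAJOR_TOPICS)


def average_topic_probability(topic_dist, topics):
    """Mean probability over the given 1-based topic ids.

    topic_dist: list of K probabilities, where index 0 holds topic 1.
    """
    return sum(topic_dist[t - 1] for t in topics) / len(topics)


def split_top_fraction(scores, fraction=0.1):
    """Split user ids into (top fraction by score, everyone else)."""
    ranked = sorted(scores, key=scores.get, reverse=True)
    n_top = max(1, int(len(ranked) * fraction))
    return ranked[:n_top], ranked[n_top:]


# Hypothetical out-document topic distribution for one user.
dist = [0.0] * K
dist[13], dist[14], dist[16] = 0.6, 0.3, 0.1  # topics 14, 15 and 17
major_avg = average_topic_probability(dist, MAJOR_TOPICS)
non_major_avg = average_topic_probability(dist, NON_MAJOR_TOPICS)

# Hypothetical engagingness scores for ten users u0..u9.
engagingness = {f"u{i}": i / 10 for i in range(10)}
top_users, other_users = split_top_fraction(engagingness)
```

For engaging users the distribution would come from the out-document, and for responsive users from the in-document; the two groups returned by `split_top_fraction` correspond to the two boxplots in each panel.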

6  Conclusions 

The  project  has  so  far  examined  engagingness  and  responsiveness  behaviors 
in  two  datasets.  It  also  resulted  in  two  publications  [11,  10]. 